SHORT-TERM RETENTION OF INDIVIDUAL VERBAL ITEMS ¹


LLOYD R. PETERSON AND MARGARET JEAN PETERSON 


Indiana University 


It is apparent that the acquisition 
of verbal habits depends on the effects 
of a given occasion being carried over 
into later repetitions of the situation. 
Nevertheless, textbooks separate ac- 
quisition and retention into distinct 
categories. The limitation of dis- 
cussions of retention to long-term 
characteristics is necessitated in large 
part by the scarcity of data on the 
course of retention over intervals of 
the order of magnitude of the time 
elapsing between successive repeti- 
tions in an acquisition study. The 
presence of a retentive function within 
the acquisition process was postulated 
by Hull (1940) in his use of the stimu- 
lus trace to explain serial phenomena. 
Again, Underwood (1949) has sug- 
gested that forgetting occurs during 
the acquisition process. But these 
theoretical considerations have not 
led to empirical investigation. Hull 
(1952) quantified the stimulus trace 
on data concerned with the CS-UCS 
interval in eyelid conditioning and it 
is not obvious that the construct so 
quantified can be readily transferred 
to verbal learning. One objection is 
that a verbal stimulus produces a 
strong predictable response prior to 
the experimental session and this is 
not true of the originally neutral 
stimulus in eyelid conditioning. 

1 The initial stages of this investigation 
were facilitated by National Science Foundation 
Grant G-2596. 

Two studies have shown that the 
effects of verbal stimulation can de- 
crease over intervals measured in 
seconds. Pillsbury and Sylvester 
(1940) found marked decrement with 
a list of items tested for recall 10 sec. 
after a single presentation. However, 
it seems unlikely that this traditional 
presentation of a list and later testing 
for recall of the list will be useful in 
studying intervals near or shorter than 
the time necessary to present the list. 
Of more interest is a recent study by 
Brown (1958) in which among other 
conditions a single pair of consonants 
was tested after a 5-sec. interval. 
Decrement was found at the one recall 
interval, but no systematic study of 
the course of retention over a variety 
of intervals was attempted. 


EXPERIMENT I 


The present investigation tests re- 
call for individual items after several 
short intervals. An item is presented 
and tested without related items intervening. 
The initial study examines 
the course of retention after one brief 
presentation of the item. 


Method 


Subjects.—The Ss were 24 students from 
introductory psychology courses at Indiana 
University. Participation in experiments was 
a course requirement. 

Materials.—The verbal items tested for 
recall were 48 consonant syllables with Wit- 
mer association value no greater than 33% 
(Hilgard, 1951). Other materials were 48 
three-digit numbers obtained from a table of 
random numbers. One of these was given to 
S after each presentation under instructions 
to count backward from the number. It was 
considered that continuous verbal activity 
during the time between presentation and 
signal for recall was desirable in order to 
minimize rehearsal behavior. The materials 
were selected to be categorically dissimilar 
and hence involve a minimum of interference. 

Procedure.—The S was seated at a table 
with E seated facing in the same direction on 
S's right. A black plywood screen shielded 
E from S. On the table in front of S were 
two small lights mounted on a black box. 
The general procedure was for E to spell a 
consonant syllable and immediately speak a 
three-digit number. The S then counted 
backward by three or four from this number. 
On flashing of a signal light S attempted to 
recall the consonant syllable. The E spoke 
in rhythm with a metronome clicking twice 
per second and S was instructed to do like- 
wise. The timing of these events is dia- 
grammed in Fig. 1. As E spoke the third 
digit, he pressed a button activating a Hunter 
interval timer. At the end of a preset inter- 
val the timer activated a red light and an 
electric clock. The light was the signal for 
recall. The clock ran until E heard S speak 
three letters, when E stopped the clock by 
depressing a key. This time between onset 
of the light and completion of a response will 
be referred to as a latency. It is to be distinguished 
from the interval from completion 
of the syllable by E to onset of the light, which 
will be referred to as the recall interval. 

Fig. 1. Sequence of events for a recall interval of 3 sec. (the recall interval is followed by the latency). 

The instructions read to S were as follows: 
“Please sit against the back of your chair so 
that you are comfortable. You will not be 
shocked during this experiment. In front of 
you is a little black box. The top or green 
light is on now. This green light means that 
we are ready to begin a trial. I will speak 
some letters and then a number. You are to 
repeat the number immediately after I say it 
and begin counting backwards by 3’s (4’s) 
from that number in time with the ticking 
that you hear. I might say, ABC 309. Then 
you say, 309, 306, 303, etc., until the bottom 
or red light comes on. When you see this red 
light come on, stop counting immediately and 
say the letters that were given at the beginning 
of the trial. Remember to keep your eyes on 
the black box at all times. There will be a 
short rest period and then the green light will 
come on again and we will start a new trial.” 
The E summarized what he had already said 
and then gave S two practice trials. During 
this practice S was corrected if he hesitated 
before starting to count, or if he failed to stop 
counting on signal, or if he in any other way 
deviated from the instructions. 

Each S was tested eight times at each of 
the recall intervals, 3, 6, 9, 12, 15, and 18 sec. 
A given consonant syllable was used only 
once with each S. Each syllable occurred 
equally often over the group at each recall 
interval. A specific recall interval was repre- 
sented once in each successive block of six 
presentations. The S counted backward by 
three on half of the trials and by four on the 
remaining trials. No two successive items 
contained letters in common. The time 
between signal for recall and the start of the 
next presentation was 15 sec. 
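
The ordering constraints described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `make_schedule` and the use of a seeded shuffle are hypothetical, and the sketch does not attempt the balancing of syllables over Ss or the rule that successive items share no letters.

```python
import random

RECALL_INTERVALS = [3, 6, 9, 12, 15, 18]  # seconds, as listed in the text
N_BLOCKS = 8                              # eight tests per interval, 48 trials


def make_schedule(seed=0):
    """Return 48 (recall_interval, count_step) pairs for one S.

    Hypothetical helper: each interval appears once in every block of six,
    and counting by three or by four is used on half of the trials each.
    """
    rng = random.Random(seed)
    steps = [3] * 24 + [4] * 24
    rng.shuffle(steps)
    intervals = []
    for _ in range(N_BLOCKS):
        block = RECALL_INTERVALS[:]
        rng.shuffle(block)  # one occurrence of each interval per block
        intervals.extend(block)
    return list(zip(intervals, steps))


if __name__ == "__main__":
    for trial, (interval, step) in enumerate(make_schedule(seed=1), start=1):
        print(trial, interval, step)
```

Shuffling within blocks, rather than over all 48 trials, is what keeps each recall interval represented once in every successive block of six presentations.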


Results and Discussion 


Responses occurring any time during 
the 15-sec. interval following 
signal for recall were recorded. In 
Fig. 2 are plotted the proportions of 
correct recalls as cumulative func- 
tions of latency for each of the recall 
intervals. Sign tests were used to 
evaluate differences among the curves 
(Walker & Lev, 1953). At each 
latency differences among the 3-, 6-, 
9-, and 18-sec. recall interval curves 
are significant at the .05 level. For 
latencies of 6 sec. and longer these 
differences are all significant at the 
.01 level. Note that the number correct 
with latency less than 2 sec. does 
not constitute a majority of the total 
correct. These responses would not 
seem appropriately described as identification 
of the gradually weakening 
trace of a stimulus. There is a suggestion 
of an oscillatory characteristic 
in the events determining them. 

Fig. 2. Correct recalls as cumulative functions of latency, with one curve for each recall interval (3, 6, 9, 12, 15, and 18 sec.). Abscissa: latency (sec.); ordinate: relative frequency. 


The feasibility of an interpretation by 
a statistical model was explored by fitting 
to the data the exponential curve of Fig. 
3. The empirical points plotted here are 
proportions of correct responses with 
latencies shorter than 2.83 sec. Parti- 
tion of the correct responses on the basis 
of latency is required by considerations 
developed in detail by Estes (1950). A 
given probability of response applies to 
an interval of time equal in length to the 
average time required for the response 
under consideration to occur. The mean 
latency of correct responses in the pres- 
ent experiment was 2.83 sec. Differences 
among the proportions of correct re- 
sponses with latencies shorter than 2.83 
sec. were evaluated by sign tests. The 
difference between the 3- and 18-sec. 
conditions was found to be significant at 
the .01 level. All differences among the 
3-, 6-, 9-, 12-, and 18-sec. conditions were 
significant at the .05 level. 
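
As a rough numerical check, the exponential curve of Fig. 3 can be evaluated at the six recall intervals. The sketch below assumes the printed equation reads p = .89[.01 + .99(.85)^t] with t in seconds; that reading, and the names `predicted_recall`, `asymptote`, `floor`, and `rate`, are assumptions made for illustration rather than the authors' notation.

```python
def predicted_recall(t, asymptote=0.89, floor=0.01, rate=0.85):
    """Proportion correct predicted after a recall interval of t seconds.

    Assumed form: p = asymptote * (floor + (1 - floor) * rate ** t).
    """
    return asymptote * (floor + (1.0 - floor) * rate ** t)


if __name__ == "__main__":
    for t in (3, 6, 9, 12, 15, 18):
        print(t, round(predicted_recall(t), 3))
```

Under this reading the curve starts at .89 for a zero interval and falls toward a floor near .01 as the interval lengthens.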

The general equation of which the 
expression for the curve of Fig. 3 is a 
specific instance is derived from the 
stimulus fluctuation model developed by 
Estes (1955). In applying the model to 
the present experiment it is assumed that 
the verbal stimulus produces a response 
in S which is conditioned to a set of ele- 
ments contiguous with the response. 
The elements thus conditioned are a 
sample of a larger population of elements 
into which the conditioned elements 
disperse as time passes. The proportion 
of conditioned elements in the 
sample determining S’s behavior thus decreases 
and with it the probability of the 
response. Since the fitted curve appears 
to do justice to the data, the observed 
decrement could arise from stimulus 
fluctuation. 

Fig. 3. Correct recalls with latencies below 2.83 sec. as a function of recall interval. Abscissa: recall interval (sec.); ordinate: relative frequency. Fitted curve: $p = .89[.01 + .99(.85)^{t}]$. 

The independence of successive presentations 
might be questioned in the light 
of findings that performance deteriorates 
as a function of previous learning (Under- 
wood, 1957). The presence of proactive 
interference was tested by noting the 
correct responses within each successive 
block of 12 presentations. The short 
recall intervals were analyzed separately 
from the long recall intervals in view of 
the possibility that facilitation might 
occur with the one and interference with 
the other. The proportions of correct 
responses for the combined 3- and 6-sec. 
recall intervals were in order of occur- 
rence .57, .66, .70, and .74. A sign test 
showed the difference between the first 
and last blocks to be significant at the .02 
level. The proportions correct for the 
15- and 18-sec. recall intervals were .08, 
.15, .09, and .12. The gain from first to 
last blocks is not significant in this case. 
There is no evidence for proactive inter- 
ference. There is an indication of im- 
provement with practice. 
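
The sign tests reported above reduce to a binomial computation on the number of Ss whose scores changed in each direction. The following is a minimal sketch, assuming a two-tailed test with ties dropped; the text does not say which form was used, and the function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(n_plus, n_minus):
    """Two-tailed sign test probability under a null of equal chances.

    n_plus and n_minus are the numbers of Ss showing a gain and a loss;
    tied Ss are assumed to have been discarded beforehand.
    """
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not data from the study.
    print(sign_test_p(17, 5))
```

The per-S counts behind the reported significance levels are not given in the text, so the example call uses invented numbers.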


EXPERIMENT II 


The findings in Exp. I are compatible 
with the proposition that the aftereffects 
of a single, brief, verbal stimulation 
can be interpreted as those of a 
trial of learning. It would be pre- 
dicted from such an interpretation 
that probability of recall at a given 
recall interval should increase as a 
function of repetitions of the stimula- 
tion. Forgetting should proceed at 
differential rates for items with differing 
numbers of repetitions. Although 
this seems to be a reasonable predic- 
tion, there are those who would pre- 
dict otherwise. Brown (1958), for 
instance, questions whether repetitions, 
as such, strengthen the “memory 
trace.” He suggests that the 
effect of repetitions of a stimulus, or 
rehearsal, may be merely to postpone 
the onset of decay of the trace. If 
time is measured from the moment 
that the last stimulation ceased, then 
the forgetting curves should coincide 
in all cases, no matter how many 
occurrences of the stimulation have 
preceded the final occurrence. The 
second experiment was designed to 
obtain empirical evidence relevant to 
this problem. 


Method 


The Ss were 48 students from the source 
previously described. Half of the Ss were 
instructed to repeat the stimulus aloud in time 
with the metronome until stopped by E giving 
them a number from which S counted back- 
ward. The remaining Ss were not given in- 
structions concerning use of the interval be- 
tween E’s presentation of the stimulus and 
his speaking the number from which to count 
backward. Both the “vocal” group and the 
“silent” group had equated intervals of time 
during which rehearsal inevitably occurred in 
the one case and could occur in the other case. 
Differences in frequency of recalls between 
the groups would indicate a failure of the un- 
instructed Ss to rehearse. The zero point 
marking the beginning of the recall interval 
for the silent group was set at the point at 
which E spoke the number from which S 
counted backward. This was also true for 
the vocal group. 

The length of the rehearsal period was 
varied for Ss of both groups over three condi- 
tions. On a third of the presentations S was 
not given time for any repetitions. This 
condition was thus comparable to Exp. I, 
save that the only recall intervals used were 
3, 9, and 18 sec. On another third of the 
presentations 1 sec. elapsed during which S 
could repeat the stimulus. On another third 
of the presentations 3 sec. elapsed, or suffi- 
cient time for three repetitions. Consonant 
syllables were varied as to the rehearsal 
interval in which they were used, so that each 
syllable occurred equally often in each condi- 
tion over the group. However, a given syl- 
lable was never presented more than once to 
any S. The Ss were assigned in order of ap- 
pearance to a randomized list of conditions. 
Six practice presentations were given during 
which corrections were made of departures 
from instructions. Other details follow the 
procedures of Exp. I. 


Results and Discussion 


Table 1 shows the proportion of 
items recalled correctly. In the vocal 
group recall improved with repetition 
at each of the recall intervals tested. 
Conditions in the silent group were 
not consistently ordered. For purposes 
of statistical analysis the recall 
intervals were combined within each 
group. A sign test between numbers 
correct in the 0- and 3-repetition conditions 
of the vocal group showed the 
difference to be significant at the .01 
level. The difference between the 
corresponding conditions of the silent 
group was not significant at the .05 
level. Only under conditions where 
repetition of the stimulus was controlled 
by instructions did retention 
improve. 

TABLE 1 
PROPORTIONS OF ITEMS CORRECTLY RECALLED IN EXP. II 
(Entries for the vocal and silent groups are not preserved in this copy.) 

The obtained differences among the 
zero conditions of Exp. II and the 
3-, 9-, and 18-sec. recall intervals of 
Exp. I require some comment, since 
procedures were essentially the same. 
Since these are between-S compari- 
sons, some differences would be pre- 
dicted because of sampling variability. 
But another factor is probably in- 
volved. There were 48 presentations 
in Exp. I and only 36 in Exp. II. 
Since recall was found to improve over 
successive blocks of trials, a superiority 
in recall for Ss of Exp. I is reasonable. 
In the case of differences 
between the vocal and silent groups 
of Exp. II a statistical test is permissible, 
for Ss were assigned randomly 
to the two groups. Wilcoxon's (1949) 
test for unpaired replicates, as well as 
a t test, was used. Neither showed 
significance at the .05 level. 

The 1- and 3-repetition conditions 
of the vocal group afforded an op- 
portunity to obtain a measure of what 
recall would be at the zero interval in 
time. It was noted whether a syl- 
lable had been correctly repeated by 
S. Proportions correctly repeated 
were .90 for the 1-repetition condition 
and .88 for the 3-repetition condition. 
The chief source of error lay in the 
confusion of the letters “m” and “n.” 
This source of error is not confounded 
with the repetition variable, for it is 
S who repeats and thus perpetuates 
his error. Further, individual items 
were balanced over the three condi- 
tions. There is no suggestion of any 
difference in responding among the 
repetition conditions at the beginning 
of the recall interval. These differ- 
ences developed during the time that 
S was engaged in counting backward. 
A differential rate of forgetting seems 
indisputable. 

The factors underlying the improve- 
ment in retention with repetition were 
investigated by means of an analysis 
of the status of elements within the 
individual items. The individual 
consonant syllable, like the nonsense 
syllable, may be regarded as present- 
ing S with a serial learning task. 
Through repetitions unrelated com- 
ponents may develop serial depend- 
encies until in the manner of familiar 
words they have become single units. 
The improved retention might then 
be attributed to increases in these 
serial dependencies. The analysis 
proceeded by ascertaining the dependent 
probabilities that letters 
would be correct given the event that 
the previous letter was correct. 
These dependent probabilities are 
listed in Table 2. It is clear that with 
increasing repetitions the serial dependencies 
increase. Again combining 
recall intervals, a sign test between 
the zero condition and the three 
repetition condition is significant at 
the .01 level. 

TABLE 2 
DEPENDENT PROBABILITIES OF A LETTER BEING CORRECTLY RECALLED IN THE VOCAL GROUP WHEN THE PRECEDING LETTER WAS CORRECT 
(Rows: repetition time of 3, 1, and 0 sec.; columns: recall interval in sec. Cell values are not preserved in this copy.) 

Learning is seen to take place within 
the items. But this finding does not 
eliminate the possibility that another 
kind of learning is proceeding concurrently. 
If only the correct occurrences 
of the first letters of syllables 
are considered, changes in retention 
apart from the serial dependencies 
can be assessed. The proportions of 
first letters recalled correctly for the 
0-, 1-, and 3-repetition conditions were 
.60, .65, and .72, respectively. A 
sign test between the 0- and 3-repetition 
conditions was significant at the 
.05 level. It may tentatively be 
concluded that learning of a second 
kind took place. 
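
The two within-item measures used above, the dependent probability of a letter being correct when the preceding letter was correct and the proportion of first letters correct, can be illustrated as follows. This sketch assumes letters are scored position by position in three-letter responses; the scoring rule, the function names, and the sample trials are all hypothetical and are not taken from the study.

```python
def letter_scores(presented, recalled):
    """Position-by-position correctness for one three-letter item."""
    return [p == r for p, r in zip(presented, recalled)]


def dependent_probability(trials):
    """P(letter correct | preceding letter correct), pooled over positions."""
    given = hits = 0
    for presented, recalled in trials:
        scores = letter_scores(presented, recalled)
        for prev, cur in zip(scores, scores[1:]):
            if prev:            # condition on the preceding letter being right
                given += 1
                hits += cur     # True counts as 1
    return hits / given if given else float("nan")


def first_letter_proportion(trials):
    """Proportion of items whose first letter was recalled correctly."""
    return sum(p[0] == r[0] for p, r in trials) / len(trials)


# Invented (presented, recalled) pairs, for illustration only.
HYPOTHETICAL_TRIALS = [("CHJ", "CHJ"), ("KBF", "KBT"),
                       ("MQZ", "NQZ"), ("XRL", "XGL")]

if __name__ == "__main__":
    print(dependent_probability(HYPOTHETICAL_TRIALS))
    print(first_letter_proportion(HYPOTHETICAL_TRIALS))
```

Separating the two measures in this way mirrors the argument of the text: the first-letter proportion cannot reflect serial dependencies within the item, since no letter precedes it.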


The course of short-term verbal re- 
tention is seen to be related to learning 
processes. It would not appear to be 
strictly accurate to refer to retention 
after a brief presentation as a stimulus 
trace. Rather, it would seem appropri- 
ate to refer to it as the result of a trial of 
learning. However, in spite of possible 
objections to Hull’s terminology the 
present investigation supports his general 
position that a short-term retentive 
factor is important for the analysis of 
verbal learning. The details of the role 
of retention in the acquisition process 
remain to be worked out. 


SUMMARY 


The investigation differed from traditional 
verbal retention studies in concerning itself 
with individual items instead of lists. For- 
getting over intervals measured in seconds 
was found. The course of retention after a 
single presentation was related to a statistical 
model. Forgetting was found to progress at 
differential rates dependent on the amount of 
controlled rehearsal of the stimulus. A por- 
tion of the improvement in recall with repeti- 
tions was assigned to serial learning within the 
item, but a second kind of learning was also 
found. It was concluded that short-term 
retention is an important, though neglected, 
aspect of the acquisition process. 


VARIATIONS IN CROSS-MASKING WITH 
FREQUENCY ¹ 


J. G. INGHAM 


Medical Research Council Neuropsychiatric Research Unit, Cardiff, Wales 


The general problem of the masking 
of one pure tone by another was in- 
vestigated intensively by Wegel and 
Lane (1924). Later writers have 
leaned heavily upon this earlier work, 
which indicated that masking only 
occurred when the two tones were 
presented to the same ear. Fletcher, 
for example, answers with a categori- 
cal negative the question: “Does the 
same interfering effect exist when the 
two tones are introduced into op- 
posite ears instead of both being 
introduced into the same ear?” 
(Fletcher, 1953, p. 157). Wegel and 
Lane’s results did indeed suggest that 
the small amount of masking appar- 
ently observed in the opposite ear 
could be accounted for by leakage of 
sound from one ear to the other. 
Zwislocki (1953) refers to a “physiological 
component,” which does not 
exceed 5 db, in cross-masking, but 
does not state the results from which 
the existence of such a component is 
inferred. In a previous paper, the 
writer has shown that the phenomenon 
of cross-masking does exist, even when 
the masking tone is not sufficiently 
intense to cause any appreciable 
direct stimulation of the opposite ear 
(Ingham, 1957). It appears likely, 
therefore, that the phenomenon is 
one which involves the brain. 

That a systematic relationship 
exists between masking and the frequency 
separation of the two tones 
was shown by Wegel and Lane (1924) 
for monaural masking. The object 
of the present investigation was to 
explore cross-masking to see how this 
effect is related to frequency. 

1 The writer wishes to express his appreciation 
to Derek Richter, Director of Research, 
and to members of the staff of Whitchurch 
Hospital and Cardiff Radiotherapy Centre 
who took part in the experiments. 

Various hypotheses were put for- 
ward as possible explanations of the 
cross-masking phenomenon (Ingham, 
1957). The present experiments were 
not intended to be critical for any of 
these but they are relevant. This 
point will be discussed later. 


EXPERIMENT I 
Method 


Design.—Using the same masking tone as 
before, that is 400 cy./sec., masking was ob- 
served on test frequencies of 250, 370, 430, 
550, 700, and 1000 cy./sec., presented to one 
ear. In one session, thresholds for all six test 
frequencies were measured. In another ses- 
sion, thresholds for only one test frequency 
were measured, the test frequency being 
either 250, 430, or 1000 cy./sec. These two 
procedures were introduced because during 
preliminary experiments results were ob- 
tained which suggested that there might be 
important differences between them. 

The Ss were paired in order of testing. 
One member of each pair had the one-fre- 
quency session first, followed by the six-fre- 
quency session on another day. For the 
other member of the pair the order was re- 
versed. Orders were allotted at random 
within each pair. Similarly, within each 
successive three sets of pairs, the test fre- 
quencies 250, 430, and 1000 cy./sec. were 
allotted at random. The masking tone was 
always presented to the most sensitive ear, as 
determined by a preliminary approximate 
threshold observation for 400 cy./sec. This 
was done in order to minimize leakage of the 
masking tone to the opposite ear. Test fre- 
quency thresholds for the least sensitive ear 
were measured, first with the opposite ear 
unstimulated and secondly with the 400 cy./ 
sec. masking tone, 30 db above threshold, in 
the opposite ear. Masking was defined as the 
average difference between the two sets of 
thresholds. 
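
The definition of masking given here amounts to a difference of mean thresholds. A minimal sketch, assuming thresholds are expressed in db and that the two sets contain equal numbers of determinations; the function name `masking_db` and the sample values are hypothetical.

```python
from statistics import mean


def masking_db(unmasked, masked):
    """Masking as mean masked threshold minus mean unmasked threshold (db)."""
    return mean(masked) - mean(unmasked)


if __name__ == "__main__":
    # Invented thresholds for one test frequency, for illustration only.
    print(masking_db(unmasked=[10, 12], masked=[13, 15]))
```

A positive value indicates that the threshold for the test tone rose when the masking tone was present in the opposite ear.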

In the one-frequency session there were 
36 threshold determinations. The first nine 
of these were for practice and they were fol- 
lowed by a rest period of at least 10 min. 
The remaining 27 were divided into 3 series of 
9. Series I was a control series. During 
Series II the masking tone was present in the 
opposite ear. Series III was introduced to 
detect possible aftereffects. In the six-fre- 
quency session there was a preliminary prac- 
tice series of 6 determinations, one for each 
test frequency in random order, followed, after 
at least 10 min. rest, by two series each con- 
sisting of 18 determinations. Each of these 
series comprised three groups of test fre- 
quencies, the six frequencies being allotted 
at random within each group. There was a 
different random order for each S. The 
masking tone was present in the opposite ear 
during Series II. This was a fairly long 
session, so Ss were not burdened by a further 
series without a masking tone. In both 
sessions Ss were told that as far as possible 
they were to ignore the masking tone. 

No soundproof room was available, and 
the experiment was conducted in the presence 
of some uncontrolled ambient noise. The 
loudest external sounds (lawn mowers, floor- 
polishers, etc.) were eliminated. The only 
sound coming from the apparatus was a low 
level hum which was constant throughout the 
experimental session. Other sounds were 
intermittent and seemed to have very little 
effect upon thresholds, judging by the consistency 
of observations within each series. 
There is no reason to suppose that variations 
in ambient noise level were related to test 
frequency or that there was any difference 
in noise level between the masked and unmasked 
conditions. 

Threshold measurements.—The test tones 
were generated by an Advance (J-2) Oscillator 
fed to a Brown's Type K earphone through a 
continuously variable Marconi attenuator. 
Thresholds were determined by the method of 
limits using intensities increasing in steps of 1 
db. Only ascending series of intensities were 
used. More details of the method and instructions 
to Ss for the one-frequency session 
will be found in the earlier article (Ingham, 
1957). It should be emphasized that a continuous 
test tone was used. An ambiguity 
has led some readers of the previous article to 
suppose that the test tone was interrupted. 
The method and instructions were essentially 
the same for the six-frequency session except 
that a different random order of time intervals 
between determinations was prepared for each S. 

Masking tone.—This was produced by 
means of a Dawe Wide Range Oscillator fed 
to the other earphone through an attenuator 
which could be changed in steps of 10 db. A 
valve voltmeter measured the input to the 
attenuator. The threshold for the masking 
tone was determined by the method of limits. 
It was defined as the average of three ascend- 
ing and three descending determinations. 
Intensity was varied by the oscillator gain 
control in steps of 0.2 v. During Series II 
the masking tone was set at threshold inten- 
sity and the attenuation then reduced by 30 db. 

Subjects.—The normal group comprised 6 
men and 12 women members of the staff of 
the Cardiff Radiotherapy Centre. The men 
were aged from 23 to 41 and the women from 
17 to 35. A group of psychiatric patients 
was also used in the experiment with the 
object of investigating the relationship be- 
tween masking and diagnosis (to be the sub- 
ject of a later article). There were 18 men and 
6 women patients, the men ranging in age from 
17 to 62 (6 of them were over 50) and the 
women from 17 to 42. All these patients were 
considered by their psychiatrists to be neu- 
rotics. 


Results 


In Fig. 1 is shown the amount of 
masking in the six-frequency sessions, 
plotted against test frequency. Re- 
sults for the normals and the two 
groups of patients are shown separately. 
The greatest masking occurs 
at the frequencies 370 and 430 cy./sec., 
where the frequency separation 
between masking and test tones is 
least. For test tones of higher or 
lower frequency the masking decreases 
as the frequency separation 
of the tones increases. A smoothed 
curve fitting these points is similar in 
shape to that reported for two tones 
presented to the same ear (Fletcher, 
1953), though the cross-masking 
curves show less masking throughout 
the whole range.² 

It will be noted that the shapes of 
the curves are similar for all groups. 
In fact the results were even more consistent 
than is indicated in Fig. 1. 
Taking Ss in seven groups of six, 
in order of testing, each group showed 
the same peak in the masking curve 
around 400 cy./sec. Because of this 
consistency from one group to another 
it may hardly appear necessary 
to do a test of significance. F ratios 
for frequency are significant for all 
three groups. 

2 This comparison should not be taken too 
seriously as the number of Ss is small. 

Masking in the one-frequency ses- 
sion showed similar variation with 
test frequency, being greatest for the 
430-cycle test tone in both normal and 
neurotic groups. The patients were 
combined into a single group because 
of the small numbers and because it 
would not otherwise have been possi- 
ble to balance for order of sessions. 
The F ratio for variations between 
frequencies in the normal group was 
not significant but that for the neurotic 
group was significant at the 1% level. 
These results were less conclusive 
than those for the six-frequency session, 
but it should be remembered 
that they were based upon compari- 
sons between different groups. The 
error variance thus included a com- 
ponent due to interindividual differ- 
ences which was eliminated in the 
six-frequency comparison. 

On the basis of some preliminary 
results, it was suspected that masking 
might be greater with the one-frequency 
procedure than with the six-frequency 
procedure. Experiment I 
revealed no evidence of this for 430 
cycles and only weak evidence for 250 
and 1000 cycles. In only one instance 
(normal group, 250-cycles test tone) 
was the difference between the two 
procedures significant (P < .02). 

A comparison of premasking thresh- 
olds under the two different condi- 
tions of measurement showed no 
significant differences between them. 
The normal group had slightly lower 
thresholds during the six-frequency 
session but this was not so for the 
neurotics. 


EXPERIMENT II 
Method 


This experiment was essentially a repeti- 
tion of Exp. I, using the six-frequency session, 
with two different masking tones, namely 200 
and 1000 cy./sec. The test frequencies were 
170, 230, 550, 970, 1030, and 3900 cy./sec. 
Each S took part in two sessions, one for the 
200 and the other for the 1000-cycle masking 
tone. The masking tones were generated by 
an Advance (H-1) oscillator but otherwise 
all the apparatus was the same as before. 
The two sessions took place on different days. 
Three Ss had the 200-cycle session first and 
three had the 1000-cycle first, the order of 
sessions being allotted at random. The procedure 
in each session was the same as in 
Exp. I, except for the order in which test 
frequencies were presented and the order of 
the time intervals between determinations. 
As before, there were two series of threshold 
determinations. During Series I the opposite 
ear was unstimulated and during Series 
II the masking tone was presented to the 
opposite ear at 30 db above the threshold. 
Each series contained three groups of six test 
frequencies. The order of test frequencies 
within these groups was determined by a 6 × 6 
latin square to balance for order effects 
common to Ss. A different latin square was 
used for each of the three groups of six test 
frequencies in one series. Series I and II 
were presented in the same order for one S. 
The same set of latin squares was used for 
the 1000- and 200-cycle sessions but each S 
was moved up one row for the 1000. Each 
S thus had a different set of test frequencies 
for the two sessions but taking the group as a 
whole the same set of six orders was used for 
both the 200- and 1000-cycle sessions. The 
time intervals of 15, 30 or 45 sec. between the 
threshold determinations were in the same 
order for all Ss. This order was randomized 
within each set of six determinations and was 
the same for both Series I and II. 
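
The counterbalancing described above can be sketched as follows. This is only an illustration: the particular squares used in the experiment are not reported, so a cyclic 6 × 6 square is assumed here, and the direction in which each S was "moved up one row" is likewise an assumption. The function names are hypothetical.

```python
def cyclic_latin_square(items):
    """Return an n x n cyclic latin square: row i is the items rotated left by i."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]


def shift_rows(square, shift=1):
    """Give subject i the row lying `shift` rows further on, wrapping round."""
    n = len(square)
    return [square[(i + shift) % n] for i in range(n)]


# Test frequencies of Exp. II in cy./sec.
test_frequencies = [170, 230, 550, 970, 1030, 3900]

# One row per S. The cyclic construction is an assumption, not the author's square.
session_200 = cyclic_latin_square(test_frequencies)
# Same set of rows for the 1000-cycle session, each S moved one row.
session_1000 = shift_rows(session_200)

for s, (a, b) in enumerate(zip(session_200, session_1000), start=1):
    print(f"S{s}: 200-cycle order {a} | 1000-cycle order {b}")
```

Under this scheme each S meets a different order in his two sessions, while the group as a whole receives the same six orders in both.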

The Ss were six members of the artisan 
staff of the hospital, all men and aged from 
30 to 51. 


Results 


In Exp. I it was noted that consist- 
ent results were obtained with groups 
of six Ss. It was thought that if the 
same general relationship between test 
frequency and masking existed for a 
different masking frequency, six Ss 
would be sufficient to show it. In 
Fig. 2, masking is plotted against test 
frequency for each of these two masking
tones, 200 and 1000 cy./sec.
Confirming the results obtained for
400 cycles there is a peak around the
masking frequency in both cases.
In other respects however, the results
for 1000 cycles differ markedly, both
from the 200-cycle results and from
those for 400 cycles in Exp. I. For
the 1000-cycle masking tone there is
no evidence of any masking effect at
all except for the two nearest frequencies,
970 and 1030, whereas for
200 cycles, the effect seems to extend
over the whole range of the test
frequencies under investigation. A test
tone of 3900 cycles is masked significantly
more by a masking tone of
200 cycles than by one of 1000 cycles
(P < .02).

The magnitude of masking for 200
cy./sec. was not great but it was so 
consistent that the effect was almost 
certainly a real one. Of 36 deter- 
minations using this masking fre- 
quency (6 test frequencies and 6 indi- 
viduals), positive masking occurred 33 
times. Fatigue may have contributed 
to this, but the results for the one- 
frequency session in Exp. I seemed to 
indicate that fatigue effects are not 
likely to be appreciable under the 
conditions of these experiments. 
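
The consistency of the 33 positive determinations out of 36 can be illustrated with a simple sign test. This is a sketch and not an analysis reported in the paper: it assumes that positive and negative masking are equally likely under the null hypothesis and treats the 36 determinations as independent, which is only approximately true since each S contributed six of them.

```python
from math import comb


def sign_test_upper_tail(successes, n, p=0.5):
    """Probability of at least `successes` positive results in n binomial trials."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes, n + 1)
    )


# 33 of the 36 determinations with the 200 cy./sec. masking tone were positive.
p_value = sign_test_upper_tail(33, 36)
print(f"P(X >= 33 | n = 36, p = .5) = {p_value:.2e}")
```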


EXPERIMENT III 
Method 


Because of this difference in the 1000 cy./ 
sec. results, the experiment was repeated us- 
ing this frequency and also another masking 
tone in the same part of the frequency scale, 
namely 840 cy./sec. As in all these experi- 
ments, the intensity of the masking tone was 
30 db above threshold and the masking and 
test tones were in opposite ears. A narrower 
range of test frequencies was used, to fill in 
the gaps left by Exp. II. The test frequencies 
were 600, 760, 920, 1080, 1240, and 1400 cy./
sec. The two masking tones were again used
in two separate sessions, the order of the
sessions being alternated among the six Ss
as before. The two sessions were on different
days. The Ss were all men of the hospital
staff, mostly from the engineering department.
They were aged from 26 to 53.

Fig. 2. Average masking effect for test tones of different frequencies (masking tones of 200 and 1000 c/sec; abscissa, test frequency in c/sec).


Results 


The results are shown in Fig. 3. 
Once again there is the increase in 
masking around the masking fre- 
quency. On a logarithmic frequency 
scale, masking declines steeply on 
both sides of the masking frequency. 
There is no evidence of any masking 
at frequencies greater than 300 cy./sec.
above the masking frequency.
This contrasts with the results for 
both 200 and 400 cy./sec. 


DISCUSSION 


The facts about cross-masking that 
have been clearly established so far are 
as follows: Firstly, the phenomenon exists 
for intensities of masking tone less than 
that required for direct stimulation of the 
opposite ear. Secondly, masking varies 
with the frequency separation of the two 
tones. Thirdly, it also appears that
masking for low frequency masking tones,
for example the 200 and 400 cy./sec.
experiments, extends over a wider range
of test frequencies, at least on the high
frequency side, than that for 800 cy./sec.
upwards. There is some indication (see
Fig. 2) of a low frequency masking a
higher frequency more than does another
intermediate frequency. If confirmed,
this is a puzzling fact which does not fit
in with the theoretical interpretations
discussed below.

Three types of hypotheses seem tenable
to explain these results. Assume
two groups of neurons, M activated by
the masking tone and T activated by the
test tone. Assume also that whenever
Group M alone is activated the masking
tone is perceived, and whenever Group T
alone is activated the test tone is perceived.
What happens when the two
tones are present together? If Groups
M and T contain no common elements
and activity in one is uninfluenced by
activity in the other, then there is no 
reason why two separate tones should 
not be perceived. There are several 
ways in which interaction between M 
and T could occur, however. 

Mutual inhibition could take place, for
example. This has been demonstrated
physiologically in the lower parts of the
auditory pathway. Studies of the auditory
nerve and cochlear nucleus by
microelectrode techniques have shown
that a neuron fires at its highest rate
when the receptor is stimulated by
sounds of a particular frequency, the
“characteristic frequency” for that neuron.
The neuron responds to other frequencies
but the rate of firing decreases
as the stimulus frequency departs from
the characteristic frequency. This has
been shown by Tasaki (1954) for the
auditory nerve, by Galambos and Davis
(1943) for the cochlear nucleus (first
thought to be the auditory nerve), and
by Hilali and Whitfield (1953) for the
trapezoid body. It has been shown
that in the cochlear nucleus and trapezoid
body the response of a neuron unit
to one tone can be inhibited by sounding
another tone at the same time. Both
tones may evoke positive responses when
presented singly. Events responsible for
cross-masking must involve a level in the
system where the neurons have synaptic
connections with fibers from both ears.
There is no reason to suppose, however,
that the inhibition must be a direct effect
between groups of neurons at the same
level. Higher centers may inhibit the
lower order neurons or even the sense
organs themselves (Galambos, 1956;
Granit, 1955).



The second possibility of interaction 
is that M and T may have neurons in 
common. Depending on the number of 
common elements present the two pat- 
terns may be virtually indistinguishable. 
This hypothesis has the advantage that 
it explains not only the existence of 
masking but also its relationship with 
frequency and other features which can 
be observed in the graphs. It is well- 
established that the response threshold 
of a neuron in the auditory pathway 
increases as the frequency of the stimulus 
departs from the characteristic frequency 
(Hilali & Whitfield, 1953). This implies 
that for any given pair of tones, the 
closer the frequencies the more common 
elements will be involved and, therefore, 
the greater the difficulty of distinguishing 
these tones. The range of frequencies 
to which a neuron will respond tends to
narrow as the characteristic frequency
increases (Galambos & Davis, 1943; 
Hilali & Whitfield, 1953). The mechan- 
ical resolving power of the cochlear 
partition increases between 200 and 1000 
cy./sec. in man as was shown by 
Békésy (Békésy & Rosenblith, 1951, p. 
1104). It has already been noted that 
masking effects occur over a narrower 
range of frequencies as the masking fre- 
quencies increase. 
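
The overlap hypothesis can be made concrete with a toy model. Everything in it is an assumption made for illustration: the population of characteristic frequencies, the rectangular response band, and the band widths are hypothetical and are not taken from the physiological studies cited. A neuron is taken to respond to a tone lying within a fixed number of octaves of its characteristic frequency, and the overlap is the fraction of the test group T that also belongs to the masking group M.

```python
import math

# Hypothetical population: characteristic frequencies spaced 48 to the octave
# from 100 cy./sec. upward (an assumption made only for this sketch).
CHARACTERISTIC_FREQS = [100.0 * 2 ** (i / 48) for i in range(320)]


def responding_set(tone, half_width_octaves):
    """Indices of neurons whose characteristic frequency lies within the
    given distance (in octaves) of the tone."""
    return {
        i
        for i, cf in enumerate(CHARACTERISTIC_FREQS)
        if abs(math.log2(tone / cf)) <= half_width_octaves
    }


def common_fraction(masking_tone, test_tone, half_width_octaves):
    """Fraction of the test group T that is also in the masking group M."""
    m = responding_set(masking_tone, half_width_octaves)
    t = responding_set(test_tone, half_width_octaves)
    return len(m & t) / len(t) if t else 0.0


# Narrow bands (assumed 0.5 octave) for a 1000 cy./sec. masking tone.
for test in (1030, 1400, 3900):
    print(test, round(common_fraction(1000, test, 0.5), 2))

# Broad bands (assumed 2.5 octaves) for a 200 cy./sec. masking tone.
print(3900, round(common_fraction(200, 3900, 2.5), 2))
```

With the narrow bands the overlap falls as the test tone moves away from the masking tone and vanishes at 3900 cy./sec.; with the broad bands some common elements remain even at that separation, which is the pattern the hypothesis requires.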

Allanson and Whitfield (1955) have 
suggested that the function of the inhibi- 
tory phenomenon mentioned above may 
be to improve frequency discrimination 
by reducing activity in the overlapping 
zone. In fact, without such an inhibitory 
mechanism patterns of activity produced 
by two neighboring tones could not be 
distinguished from that of a single tone 
of intermediate frequency. 

The third possible explanation derives
from the fact that neurons tend to fire
spontaneously as well as by external
stimulation. There is thus a background
of “noise,” random activity against
which T has to be distinguished. The
organism is faced with a statistical problem.
If, in the sensory system concerned,
the T group of neurons is more
active than the other neurons in the 
system, is this because of random vari- 
ations or are the T neurons being acti- 
vated by an external stimulus? In 
other words, do the two sets of activity 
rates come from the same population or 
not? The lowest activity rate in T 
which can be taken as reliable evidence 
of the presence of an external stimulus 
will depend upon the variance of the 
random activity rate and upon the num- 
ber of neurons involved both in T and in 
the rest of the system. Any event which 
increases the variance or decreases the 
number of neurons in the comparison will 
necessitate a higher rate of firing in T 
and therefore a higher stimulus intensity, 
for the test tone to be distinguishable. 
The effect of the masking tone can be 
interpreted in either way. Either the 
neurons comprising M no longer form a 
part of the background population, which 
is thus depleted, or alternatively, the 
activity rate variance is increased. This
formulation, first suggested to the writer
in a publication by Gregory and Cane 
(1955), was stated more fully in an 
earlier article (Ingham, 1957). 
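
The statistical argument can be sketched numerically. The following assumes, purely for illustration, that the comparison is a two-sample test on mean firing rates with a common standard deviation; the criterion value, the rates, and the numbers of neurons are hypothetical and do not come from the paper.

```python
import math


def required_excess_rate(sigma, n_test, n_background, z=2.33):
    """Smallest excess of the mean rate in T over the background mean that
    reaches the criterion z, for a common standard deviation sigma."""
    return z * sigma * math.sqrt(1.0 / n_test + 1.0 / n_background)


# Hypothetical values: 50 neurons in T, 1000 in the background population.
baseline = required_excess_rate(sigma=5.0, n_test=50, n_background=1000)

# The masking tone removes the M neurons from the background population ...
depleted = required_excess_rate(sigma=5.0, n_test=50, n_background=200)

# ... or, alternatively, it increases the variance of the activity rate.
noisier = required_excess_rate(sigma=8.0, n_test=50, n_background=1000)

print(f"baseline {baseline:.2f}, depleted {depleted:.2f}, noisier {noisier:.2f}")
```

Either change raises the excess rate needed in T, and so the stimulus intensity at which the test tone becomes distinguishable.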

None of these explanations is incon- 
sistent with the evidence but the second 
one (overlapping patterns) offers the 
simplest explanation of most of the facts. 
These ideas are probably greatly over- 
simplified. It will be noted, for example, 
that a pure place theory of hearing is im- 
plied. It is probable that temporal 
factors play their part in pitch discrimination
also, particularly at lower frequencies
and in the lower sections of the
auditory pathway (Licklider, 1955). 
It appears, however, that the facts of 
cross-masking so far obtained can be ac- 
counted for without postulating any- 
thing other than an identification of 
pitch with activity in a certain group of 
neurons, regardless of the temporal pat- 
tern in which impulses appear in these 
neurons. 


SUMMARY 


The object of these experiments was to 
examine the differential masking effect of one 
tone upon another in the opposite ear, when 
the frequency separation of the two tones was 
varied. Monaural thresholds for six fre- 
quencies were determined before and during 
exposure to a continuous masking tone of 
constant frequency, 30 db above the thresh- 
old, using a modified method of limits. It 
was found that the masking effect decreased 
as the frequency separation of the tones in- 
creased. Low frequency tones seemed to 
mask a wider range of test frequencies (on 
the high frequency side of the masking tone) 
than masking tones of 800 cy./sec. upwards. 
The results may be explained in terms either 
of mutual inhibition or of overlapping patterns 
of activity. A statistical hypothesis is also 
tenable.


PACED MEMORIZING IN A CONTINUOUS TASK

JANE F. MACKWORTH


It was shown in a previous paper
(N. H. Mackworth & J. F. Mack- 
worth, 1959) that the memory span 
was severely limited in a complex 
serial task. The Ss could not use 
information about a future decision 
when more than two decisions separ- 
ated this advance information from 
the moment at which it should be 
used. It was also shown that at a 
faster speed of work even less advance 
information could be stored. 

A number of studies have been made 
on the memory span in a continuous 
task. Some of these have been dis- 
cussed in the paper mentioned above. 
For instance, Poulton (1954) showed 
that Ss could only usefully remember 
one item in a 16-choice, double-joy- 
stick task and only two changes in 
position in a harmonic step tracking 
task. 

In another study, Kay (1953) 
showed that in a serial task when a 
response was delayed for four items 
behind its own stimulus, then Ss were 
unable to maintain the serial nature 
of the task at all, i.e., they either re- 
sponded to a previous stimulus or 
they memorized a present stimulus 
for future response, unless the speed 
of presentation was reduced to 4 sec. 
per stimulus. A similar experiment 
has been described by Kirchner (1958). 

It was clear from these experiments 
that memory span in a serial task is 
extremely limited, if rehearsal is pre- 
vented by speed. It was, therefore, 
decided to make a detailed enquiry 
into the relation between performance
on a serial task and time available for
each response. Neither self-paced
nor standard fixed speeds proved satisfactory
because of the large variability
among Ss. Therefore the present
experiment determined the speed at
which a criterion of 80% correct
responses could be achieved. This
avoided the possibility of alternating
groups of stimuli with groups of
responses.

1 The author is indebted to E. C. Poulton,
who read the manuscript in draft. Statistical
advice was given by M. Stone.


EXPERIMENT I
Method 


Task.—The task presented a stimulus, 
which was a capital letter, to which S had to 
respond by pressing a switch marked by that 
letter. When he pressed this switch at the 
right time, he was informed by the onset of a 
light, and a point was scored on a counter. 
He sat in front of a keyboard of 25 lights, each 
labelled with a letter of the alphabet, arranged 
in two rows in alphabetical order. Below 
each light was a press button. Above this 
keyboard, a window showed a series of letters, 
one at a time. The letters were on four continuous
belts, each holding 50 letters. The
belt in use was moved on one letter at a time 
by an electric timing device. This also 
moved on a contact which rendered available 
one circuit at a time. These circuits were 
connected to the labelled lights in the same 
order as the letters on the belt. If S pressed 
the button at the time when that circuit was 
in use, the light would come on. The belts 
could be arranged so that they showed a letter 
any required number of ticks on the timer 
ahead of the tick when the circuit for that 
letter was available. 

Subjects.—The Ss were naval enlisted men. 
Five Ss constituted Group 1. In order to test 
a hypothesis suggested by the results from this 
group, five additional Ss were tested as Group 
2. Ten Ss were used for Group 3. 

Stimulus-response relationships.—In the 
practice, S was first instructed to press the 
button corresponding to a letter when that 
letter had just disappeared from the window.
This represented Advance 1. There was
nothing to prevent S pressing the switch
before its letter had disappeared from the 
stimulus panel, provided that he held it down 
until the next letter had appeared. This was 
demonstrated to him. Thus no memory was
involved in this condition. 

When S had learned his way around the keyboard
with this condition, Advance 2 was
presented to him. In this case he had to
press the switch corresponding to a letter 
when the next letter but one had appeared. 
Thus he always had to remember one letter, 
which had to be changed with each response. 
Similarly, with Advance 3 he had to remember 
two letters, three letters for Advance 4, and 
four letters for Advance 5. Advance 0 
represented a simple stimulus-response situ- 
ation. The S had to press the button cor- 
responding to a letter before that letter disap- 
peared. 
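
The stimulus-response relationships can be sketched as a schedule of which letter is due at each tick of the timer. This is a simplification that treats the task as a sequence of discrete ticks; the function names are hypothetical.

```python
from collections import deque


def memory_load(advance):
    """Letters S must hold in memory, as described for each advance."""
    return max(advance - 1, 0)


def due_responses(letters, advance):
    """Return (tick, letter on display, letter to be pressed) for each tick.

    The letter to be pressed is the one shown `advance` ticks earlier,
    or None while no response is yet due.
    """
    shown = deque(maxlen=advance + 1)
    schedule = []
    for tick, letter in enumerate(letters):
        shown.append(letter)
        due = shown[0] if len(shown) == advance + 1 else None
        schedule.append((tick, letter, due))
    return schedule


for tick, on_display, to_press in due_responses("ABCDE", advance=2):
    print(tick, on_display, to_press)

print([memory_load(a) for a in range(6)])
```

With Advance 2 the first response falls due when the third letter appears, and the letter in between is the one item held in memory.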

Design.—Each S was given a single ad- 
vance condition per session. Each S in 
Group 1 had Advances 0, 1, 2, and 3, each 
being given at two sessions. Group 2 re- 
ceived the same advances, but each advance 
was given at only one session. The eight 
sessions (four advances twice each) were
given to Group 1 and the four advances to 
Group 2 in random order. Group 3 received 
Advances 2 through 5 twice over. 

Speed.—Each S was given six runs of 50 
letters each in each session. These six runs 
were given at three different speeds, each 
speed being given twice, and the speeds were 
randomized. These speeds were chosen to 
span the 80% correct score, from the approximate
estimates derived from the preliminary
runs. Each S could have a different range 
of speeds from any other S, though in fact 
the range for several Ss often coincided. 
There was often an increase in the average 
speed given to S at one advance between one 
session of that advance and the next. Even 
if there was no change in the average speed 
given, S could still show an increased speed 
of performance at the 80% level. 

Statistical analysis.—A quadratic formula 
was fitted to the performance scores obtained 
at these three different speeds for one session 
for each S. This gave an estimate for the 
speed at which the performance of that S was 
at the 80% correct level. Thus a figure was 
obtained for speed which was on a continuum, 
and so could be used for analysis of variance. 
The Friedman χᵣ² test was also applied to the
results, after the advances had been ranked 
for each S. 
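
The interpolation can be illustrated as follows. The exact fitting procedure is not given in the paper, so this sketch assumes that the scores at each of the three speeds are averaged and that a quadratic is passed through the three resulting points; the scores used are invented for the example.

```python
import math


def quadratic_through(points):
    """Coefficients (a, b, c) of y = a*x**2 + b*x + c through three points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3**2 * (y1 - y2) + x2**2 * (y3 - y1) + x1**2 * (y2 - y3)) / denom
    c = (
        x2 * x3 * (x2 - x3) * y1
        + x3 * x1 * (x3 - x1) * y2
        + x1 * x2 * (x1 - x2) * y3
    ) / denom
    return a, b, c


def time_for_criterion(points, criterion=80.0):
    """Seconds per stimulus at which the fitted curve reaches the criterion,
    restricted to the range of speeds actually tested; None if there is none."""
    a, b, c = quadratic_through(points)
    c -= criterion
    if abs(a) < 1e-12:
        if abs(b) < 1e-12:
            return None
        roots = [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            return None
        r = math.sqrt(disc)
        roots = [(-b - r) / (2 * a), (-b + r) / (2 * a)]
    lo = min(x for x, _ in points)
    hi = max(x for x, _ in points)
    inside = [t for t in roots if lo <= t <= hi]
    return inside[0] if inside else None


# Hypothetical session: (seconds per stimulus, mean per cent correct).
session = [(0.8, 62.0), (1.0, 78.0), (1.2, 88.0)]
print(time_for_criterion(session))
```

Restricting the solution to the tested range reflects the requirement that the three speeds span the 80% score.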


Results 


It was found that performance at 
the fastest speed chosen in each case 
was always worse than performance 
at the slowest speed, despite random 
order, thus justifying the interpola- 
tion. Table 1 shows the calculated 
speeds for 80% correct performance. 
Both runs are given. Figure 1 shows 
the second runs only for all the ex- 
periments. 


TABLE 1

SECONDS PER STIMULUS TO ACHIEVE 80% CORRECT RESPONSES

[Columns: Exp.*, Group, and Mean and SD at each Advance. Legible entries, in the order they survive (Mean, SD): 1.19, .08; .85, .05; 1.15, .09; .54, .01; .83, .07; .55, .05; .77, .04; further Means: 1.83, 1.57.]

* Experiment I involved separate letters as stimuli; Exp. II involved labelled lights; Exp. III involved unlabelled lights.


It was found that four Ss were unable
to reach the 80% correct responses
at Advance 5 at any speed presented. 
(The slowest speed was 10 stimuli/ 
min., or 6 sec. per stimulus.) Six Ss 
were able to reach the criterion with 
Advance 5, and the mean result for 
these six is shown in Table 1 and by a 
dotted line in Fig. 1. 

After examination of the data had 
shown that there was no large hetero- 
geneity of variance in the calculated 
speeds at which Ss in Group 1 had 
achieved the 80% performance level, 
analysis of variance was applied. It 
was found that Advances (P < .1%,
tested against Ss × Advances interaction)
and Sessions (P < .1%, tested
against Ss × Sessions) were both
significant.

Fig. 1. Calculated speed of 80% correct
responses at different advances for the various
displays used.

The rank order of the advances was 
the same for every S in Groups 1 and 
3. This rank order was Advances 1, 
2, 0, 3 when arranged from the fastest 
to the slowest. (The 5 Ss of Group 2 
showed the same rank orders except 
that one S was faster at Advance 2 
than Advance 1.) The significance 
of this rank order was tested by the 
Friedman χᵣ² test, which rejected the
null hypothesis that all rankings were 
equally likely for each group (P < 
.1% for each group). 
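
The Friedman statistic for rank orders of this kind can be computed from the usual formula, χᵣ² = 12 ΣRⱼ² / [nk(k + 1)] − 3n(k + 1), where Rⱼ is the sum of ranks for advance j over n Ss and k advances. The sketch below applies it to the rank orders described for the five Ss of Groups 1 and 2; it is an illustration of the test and not a reproduction of the author's computation.

```python
def friedman_chi_square(rank_rows):
    """Friedman statistic for n subjects (rows) ranking k conditions (columns)."""
    n = len(rank_rows)
    k = len(rank_rows[0])
    column_sums = [sum(row[j] for row in rank_rows) for j in range(k)]
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in column_sums) - 3.0 * n * (
        k + 1
    )


# Ranks from fastest (1) to slowest (4), columns in the order Advance 0, 1, 2, 3.
# The reported order 1, 2, 0, 3 gives Advance 0 rank 3, Advance 1 rank 1, etc.
group_1 = [[3, 1, 2, 4]] * 5

# Group 2: the same, except that one S was faster at Advance 2 than Advance 1.
group_2 = [[3, 1, 2, 4]] * 4 + [[3, 2, 1, 4]]

print(friedman_chi_square(group_1))
print(friedman_chi_square(group_2))
```

With complete agreement among five Ss the statistic takes its largest possible value, n(k − 1).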

Thus it is clear that, except for 
Advance 0, the time required by S 
to do the test properly was directly
related to the number of items which 
separated the stimulus from its re- 
sponse. 

It was found from Group 1 that S 
could manage Advance 2 slightly 
faster than Advance 0. In order to 
verify the hypothesis that this was the 
case, Group 2 was tested in a similar
design (one run per advance only) and 
again each S could work faster at 
Advance 2 than at Advance 0 (P
< 1%).


EXPERIMENT II


The previous experiment was designed
to encourage verbal rehearsal.
It was therefore of interest to determine
what differences would be produced
by altering the display so that
the emphasis was on the spatial
location of the stimuli.



Method 


Instead of presenting the stimuli as separ- 
ate letters that required search to locate the 
response buttons, the stimuli were now the 
green lights themselves, thus more closely 
resembling the Kay (1953) and Kirchner 
(1958) experiments mentioned. Each light 
was labelled with a letter, as before, so that S 
could remember it by its letter as well as by 
the spatial position.

Ten enlisted men were used as Ss in Group
4; they received Advances 2, 3, and 4. 


Results 


Table 1 and Fig. 1 show the results. 
It can be seen that Ss could work 
faster with Advance 2 when search 
was reduced, but that no increase in 
speed was obtained with Advances 3 
and 4, as compared with the separate 
letter stimulus. 


EXPERIMENT III 


Since Exp. II showed that the 
change of stimulus from separate 
letters to lights next to their response 
buttons improved performance at 
Advance 2 but not at Advances 3 and 
4, it was concluded that verbal mem- 
ory was still the dominant factor at 
these higher advances. Therefore 
Exp. III was designed to remove the 
verbal factor. The hypotheses were 
that, as compared with the letter 


stimuli: (a) increased speed would be 
obtained with Advances 0, 1, and 2, 


where rehearsal was not needed; 
(b) decreased speed would be reached 
with higher advances when verbal 
rehearsal was made more difficult. 


Method 


The green lights were the stimuli, as in the 
previous experiment, but they were no 
longer labelled, so that S had to remember 
them spatially or supply his own names. 
In order that the fastest possible response 
times should be obtained, the button switches 
were replaced by microswitches for Group 6. 
Ten Ss were used for Group 5, which received 
Advances 2, 3, and 4, and five Ss were used 
for Group 6, which received Advances 0, 1 
and 2. 


Results 


The results are again shown in 
Table 1 and Fig. 1. Not one of the 
10 Ss could reach the required score 
of 80% correct with Advance 4. The
slowest speed used was 6 sec. per
stimulus. 

A further calculation was made for 
Advance 0, with the microswitches, 
to find the speed at which 50% correct 
responses were obtained. Since no 
mistakes were made, this gives an 
approximate measure of reaction time, 
because half the responses must have 
been too slow for their permitted 
time. The reaction time ranged from 
.67 to .74 sec., with a mean of .71. 
Such a serial reaction time is likely to 
be longer than reaction times meas- 
ured separately. 

Comparison of the results from 
Group 6 (microswitches and light 
stimulus) with those from Group 1 
(separate letter stimulus) showed that 
Group 6 could respond approximately 
.3 to .4 sec. faster.

Comparisons between Exp. I, II, and 
III.—Groups 3, 4, and 5 had the same 
design, so each advance could be 
compared by analysis of variance. 
This showed that the null hypothesis 
could be rejected at the 1% level for 
Advance 2, and at the 5% level for 
Advance 3. Analysis of differences 
between means for Advance 2 showed 
that Group 3 (separate letter stimulus) 
was significantly (P < 1%) slower 
than Groups 4 and 5 (light stimulus 
next to response switch), who did not 
differ from each other. Similarly, 
Group 3 and Group 4 were significantly
(P < 5%) faster than Group 5 in Advance 3,
but did not differ from each other.

A different comparison was made 
for Advance 4 since no numerical val- 
ues were obtained for Group 5. On the 
basis of whether or not they reached 
the criterion, there was a significant 
difference (P < 1%) between Group 
5, not one of the 10 Ss reaching it, and 
Groups 3 and 4, all of whom did. 
Mean per cent correct responses for 
the speed of 3 sec. per stimulus and
Advance 4 were as follows: Group 3,
86.6% (SD = 9.4); Group 4, 81.7% 
(SD = 12.3); and Group 5, 58% 
(SD = 4.7). 
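
The comparison by criterion for Advance 4 can be illustrated with an exact test on the 2 × 2 table of Ss reaching or failing the criterion. The paper does not name the test that was used, so Fisher's exact test is assumed here for illustration only.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided exact probability for the table [[a, b], [c, d]] with fixed
    margins: the chance of a count in the first cell as small as a or smaller."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    lowest = max(0, col1 - row2)
    favourable = sum(
        comb(row1, x) * comb(row2, col1 - x) for x in range(lowest, a + 1)
    )
    return favourable / comb(total, col1)


# Group 5: 0 of 10 Ss reached the criterion; Groups 3 and 4: all 20 Ss did.
p_value = fisher_one_sided(0, 10, 20, 0)
print(f"{p_value:.2e}")
```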


DISCUSSION 


Advances.—Presenting the stimulus 
one ahead of the required response (Ad- 
vance 1 vs. Advance 0) so that there was 
an overlap between the stimulus and the 
preceding response, enabled Ss of Group 
1 to gain approximately .3 sec. per response.
Moreover, the results for Advance 0
compared with Advance 2, where
S had to remember one stimulus while 
responding to the previous one, show that 
the advantages of foreknowledge were so 
great as to outweigh the disadvantages 
of having to memorize a continually 
changing stimulus. 

However, when Advance 3 was 
reached, where S had to hold two con- 
tinually changing items in the memory, 
the difficulty of the task was so increased 
that it took an S of Group 1 nearly twice
as long to do the task as it did with 
Advance 1. In comparing Advance 0 
with Advance 3 it can be seen that the 
advantage of foreknowledge was com- 
pletely outweighed by the difficulty of 
remembering two items. 

It can be seen from Table 1 that
it took Ss of Groups 1 and 3 approximately
1 sec. to remember one letter
(Advance 2), about .6 sec. longer (i.e.,
1.6 sec.) to deal with two letters between
a stimulus and its response, and about
another extra .8 sec. (i.e., 2.4 sec.) to deal
with three intervening letters. A further
increase of about 1.2 sec. was necessary
for those who succeeded in dealing with
four intervening letters. This time was
occupied by S in repeating to himself the
four, etc., changing items each time.
The actual task was in a sense the same at
each advance; S was required with each
stimulus to memorize one letter and
make one response to a previous letter.
Yet even when only two letters separated
the stimulus from the response, the
majority of Ss required the repetition of
these two letters each time. It was this
overt repetition which required approximately
1 sec. for each letter.
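
The arithmetic behind these approximate figures can be laid out explicitly. The increments are the rounded values quoted in the text for Groups 1 and 3, and the time per letter held is derived from them here for illustration.

```python
from itertools import accumulate

# Approximate extra time (sec. per stimulus) quoted for each step in advance.
increments = {2: 1.0, 3: 0.6, 4: 0.8, 5: 1.2}

advances = sorted(increments)
totals = list(accumulate(increments[a] for a in advances))

for advance, total in zip(advances, totals):
    letters_held = advance - 1
    print(
        f"Advance {advance}: {total:.1f} sec. per stimulus, "
        f"{total / letters_held:.2f} sec. per letter held"
    )
```

The time per letter held stays between about .8 and 1 sec. across Advances 2 to 5, which is the basis for the estimate of approximately 1 sec. for each letter.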

Conditions.—In Exp. I the stimulus
was a letter seen in a window above the 
keyboard; on the keyboard the response 
buttons lit up their own lights when the 
response was timed correctly. In Exp. 
II these individual green lights were lit 
singly as the stimulus, but they still re- 
tained their identifying labels or alpha- 
betical letters. Thus each response was 
made by the button next to the light 
which had previously been lighted at
some time interval determined by the 
instructions. In Exp. I S had to mem- 
orize the letter and then search for it on 
the keyboard. In Exp. II this search 
was avoided, unless S remembered only 
the identifying letter verbally, but not 
the spatial position visually. In Exp. 
III the stimulus was again the individual 
light, but these were no longer labelled 
with letters. 

Comparing Group 6 with Group 1, it 
can be seen that about .4 sec. were saved 
for each response for Advances 0, 1, and 
2, when the position of the response was 
identified directly by the stimulus. It is 
particularly interesting to note that this 
advantage remained the same even when 
Ss had to remember at least one other 
stimulus between each stimulus and its 
response. 

However, when more than one other 
stimulus had to be remembered between 
each stimulus and its response, the posi- 
tion was quite different. It appears 
very likely that one extra stimulus could 
be memorized by its spatial position, 
that is, it was memorized visually, but 
more than one could not be retained in 
this way, but had to be translated into a 
verbal memory. This was relatively 
easy when the lights were directly identi- 
fied by their labels, but very difficult 
when the lights were not so identified.
Thus, it is seen that the labelled stimulus 
lights of Exp. II gave very similar results 
to those obtained in Exp. I where separ- 
ate letters were the stimuli, when Ad- 
vance 3 is considered. Advance 4 gave 
even worse results with labelled lights 
than with separate letters. However, 
when the lights were not labelled, as with
Group 5 in Exp. III, Ss took markedly
longer over Advance 3 than in the previ- 
ous two experiments, while not one of the 
10 Ss was able to do the task to the cri- 
terion with Advance 4. (The unlabelled 
lights also increased the differences be- 
tween Ss for advances [cf. SD for Ad- 
vance 3, Groups 3, 4, and 5].) 

Theory.—These results fit in with the 
theory that the memory trace goes 
through two different stages; the first 
stage is a direct perceptual trace, pre- 
sumably some sort of electrical change in 
the appropriate part of the brain, in this 
case the occipital lobe. This trace takes 
perhaps .25 to .5 sec. to record, as judged 
by eye movement studies. It is of very 
brief duration, and presumably sets up 
another trace elsewhere in the brain. 
This secondary trace takes longer to 
establish, perhaps up to 1 sec. or more, 
and may consist of a reverberating circuit 
which is more durable. The experi- 
ments reported in this paper suggest 
that the first trace may be a direct repre- 
sentation of the visual situation, while 
the second trace tends to be a verbal one. 
Thus the first direct trace must be trans- 
lated by the brain into a verbal one to be 
stored in a more durable form. This is 
suggested by the fact that approximately 
1 sec. extra is required for each extra 
stimulus to be remembered. 

A response can overlap a later stimulus 
completely and still come into the high 
speed category of direct response to the 
primary perceptual trace of the previous 
stimulus. Such a complete overlap is 
found in Advance 2, where response to a 
stimulus cannot be begun until the stimu- 
lus has disappeared and the next is 
visible. But when two stimuli intervene 
between another stimulus and its re- 
sponse, as with Advance 3, then it seems 
that the trace must be verbalized, so 
that it becomes sufficiently durable to be 
manipulated mentally, a continual reordering
going on, one being dropped and
one added each time. This verbal trace
is perhaps recorded as a reverberating
circuit, which is a unit, and therefore
must be reconstructed each time into
the new unit. This reconstruction involves
repeating the whole number of
stimuli to be remembered each time.


SUMMARY 


An investigation was made to discover the 
number of stimuli that Ss could usefully 
remember in a simple stimulus-response task. 
The stimulus was either an alphabetical letter 
or a light and the response button was labelled 
with the same letter. The problem was to 
press the button at the right time in relation 
to the stimulus. This time was determined 
by the instructions, which required the re- 
sponse to a certain stimulus to be delayed 
until a definite number of later stimuli had 
appeared. In other words, S had to remember
a continually changing small group of letters. 
The size of the group to be remembered was 
varied from zero to four letters, in addition to 
the one to which response was being made. 
The test was scored by determining the speed 
at which each S could reach an 80% correct 
score. 

It was found that Ss required approximately 
1 sec. per stimulus for each member of the
group which they had to hold in memory.
Thus when the response lay five letters behind 
its stimulus, so that four letters had to be 
remembered, Ss required 4 sec. per stimulus 
to achieve the 80% performance level. How- 
ever, not all Ss could reach this level at this 
Advance 5, and the memory span in such a 
continually changing serial task was thus 
about three to four items, under the condi- 
tions of the experiment. 

Further experiments showed that identify- 
ing letters were not necessary for Advances 
0, 1, and 2, but were necessary for Advances 
3 and 4. These experiments emphasized 
the difference between remembering one item 
at a time and remembering more than one 
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A previous experiment (Garner, 
1954) has shown that when Ss are 
required to make half-loudness judg- 
ments with the psychophysical method 
of constant stimuli, their resultant 
half-loudness values are determined 
almost completely by the range of 
intensities used for the comparison 
tones. In that experiment, three 
different groups of Ss received three 
different ranges of intensities of the 
comparison stimulus, and for each 
group the mean half-loudness value 
was almost exactly at the midpoint 
of the range of comparison intensities 
used. This result suggests that there 
is little or no validity to such judg- 
ments. 

That experiment also showed, how- 
ever, that individual Ss, within a 
given group, differed from each other 
in their judgments considerably and 
reliably. In other words, they ap- 
peared to know what they were doing 
and to do it consistently. These 
facts present the paradox of Ss ex- 
hibiting a high reliability in a judg- 
ment for which there appears to be no 
validity. 

In the previous paper it was sug- 
gested that one possible source of 
differential bias for Ss was in the brief 
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preliminary series which had been 
used. Each S had been given a 
random series of 20 practice judg- 
ments, and there was no recording or 
control over the intensities of these 
preliminary trials other than that 
they be random and within the range 
of intensities to be heard by S in the 
main series. Thus it is possible that 
some Ss consistently judged high with 
respect to their range of comparison 
intensities as a result of having had 
their first experience in the experimen- 
tal situation with stimuli on the low 
side, and conversely for other Ss. 
This explanation is in line with the 
assumption that the total stimulus 
context in the experimental situation 
completely determines the experimen- 
tal result—both for groups of Ss and 
for individual Ss. 

The present experiment was con- 
ducted to determine whether the dif- 
ferences between Ss could be predicted 
and controlled by deliberate manipu- 
lation of the stimuli heard in the 
preliminary series. In brief, three 
different ranges of comparison stimu- 
lus intensities were used again, but in 
addition three different ranges of pre- 
liminary stimulus intensities and three 
different lengths of the preliminary 
series were used. It was expected 
that both the intensities and the 
length of the preliminary series would 
significantly affect the half-loudness 
judgments in the main series of the 
experiment. 


METHOD 


Subjects—The total number of Ss used 
was 135, and all were male college students 
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who were taking the elementary course in psy- 
chology and were required to serve as Ss as 
part of the course. None had previously been 
used in auditory experiments. Each of them 
had an absolute threshold for a 1000 cps tone 
of 15 db SPL (db re .0002 microbar) or less, 
but this threshold was determined after the 
experiment had been run so that the threshold 
determinations would not provide a prior 
stimulus context. In the few cases where the 
threshold was too high the data were dis- 
carded and another S was run. 

Procedure—The procedure is described 
more exactly in another paper (Garner, 1958) 
which was concerned with the very first half- 
loudness judgments made. Briefly, the pro- 
cedure is as follows: Each S was run indi- 
vidually, and when first brought to the ex- 
perimental rooms was given a set of written 
instructions which told him that he would
hear pairs of tones and that he was to judge 
whether the second tone of the pair was more 
or less than half as loud as the first. It was 
emphasized that we were concerned only with 
how the tones sounded to him, not in an ab- 
solute criterion. He was allowed to ask 
questions about the procedure, and when he 
was satisfied, he was seated in a soundproof 
room adjacent to the control room, and a 
headset wired for monaural listening was 
fitted to his head. The S was then asked 
again to repeat the instructions as he under- 
stood them through an intercommunication 
system, and if he was correct, the experiment 
was begun. During the preliminary trials, 
S both wrote his responses on a record sheet 
and called them out, so that E could be sure 
the experiment was going properly. During 
the main series, S only wrote his responses. 
The S was not told that the preliminary series 
had a different function in the experiment 
than the main series. 

These somewhat elaborate instructions 
and precautions were taken to be sure that no 
practice trials other than the experimental 
preliminary series would be needed, since it 
was assumed that the very first stimuli heard 
would have some effect on S. From the time 
that S came to the experimental rooms all 
equipment was turned off so that he would 
not inadvertently hear the stimuli; S heard 
no stimuli until the experiment actually began. 

The actual stimuli used were pairs of 
1000-cps tones, each 1 sec. long and separated 
by 1 sec. Pairs of tones were presented 
every 7 sec. The tones were produced by an 
electronic timing system and filtered through 
a narrow band-pass filter to eliminate transient
and switching frequencies. The intensity
of the first tone (standard) of the pairs
was always 90 db SPL for all preliminary
and main series. The intensities of the second 
tone (comparison) varied according to the 
experimental conditions. 

Experimental conditions.—The experiment 
was conducted as a complete 3 x 3 x 3 
factorial design with five Ss in each of the re- 
sultant 27 experimental groups. The three 
variables which entered into the design were: 
intensity range of the comparison stimuli of 
the main series, intensity range of the com- 
parison stimuli of the preliminary series, and 
length of the preliminary series.

The main series was composed of 240 trials 
or pairs of tones to be judged. For one group 
these comparison stimuli ranged from 75 to 
85 db; for another group from 65 to 75 db; 
and for the third group from 55 to 65 db. 
These were the same intensity ranges that 
had been used previously (Garner, 1954). 
Within each of these ranges six different 
actual intensities were used, spaced every 2 
db over the 10 db range. The runs were 
arranged so that at the end of each quarter 
of the main series each intensity had been 
used equally often. In addition, each inten- 
sity followed every other one an equal number 
of times as nearly as was possible for each 
quarter. 
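
The arithmetic of the main series can be checked with a short sketch. It assumes only the figures stated above (six levels spaced 2 db apart, 240 trials, four quarters); the function and variable names are illustrative, and the sequential constraint (each intensity following every other one equally often) is not implemented here.

```python
def comparison_levels(low_db, high_db, step_db=2):
    """Comparison intensities spaced every step_db across one main-series range."""
    return list(range(low_db, high_db + 1, step_db))

MAIN_TRIALS = 240
QUARTERS = 4

levels = comparison_levels(75, 85)                 # highest of the three ranges
trials_per_quarter = MAIN_TRIALS // QUARTERS
# Equal use of each intensity within a quarter fixes this count.
presentations_per_level = trials_per_quarter // len(levels)

print(levels)                    # [75, 77, 79, 81, 83, 85]
print(trials_per_quarter)        # 60
print(presentations_per_level)   # 10
```

Under these figures each of the six intensities is heard 10 times per quarter, or 40 times over the whole main series.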

The second variable was the intensity of 
the preliminary series. Each of the three 
groups defined by the main series intensity 
was divided into three subgroups, and each 
of these three subgroups received preliminary 
intensities which were high, medium, or low 
with respect to the main series intensities 
which were to be received. For the first 
comparison stimulus received, Ss who were 
later run with the 75-85 db range received 
85, 80, or 75 db. The 65-75 db group re- 
ceived 75, 70, or 65 db, and the 55-65 db 
group received 65, 60, or 55 db for the first 
comparison stimulus in the preliminary series.
The first two preliminary intensities used were 
always the values just stated. As the pre- 
liminary series continued, the intensities of 
the comparison stimuli varied randomly in a 
range of 2 db on each side of this stated in- 
tensity. Thus, for example, the group re- 
ceiving 75-85 db main series intensities and 
the high preliminary intensities actually 
received values at 83, 84, 85, 86, and 87 db.
This arrangement means that 15 Ss each 
received preliminary intensities around 85, 
80, 70, 60, and 55 db, and 30 Ss each received 
preliminary intensities around 75 and 65 db. 
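
The counts in the last sentence follow from the overlap of the three main ranges, and a small tally makes this explicit. The sketch assumes 15 Ss per subgroup (135 Ss divided among the nine main-range by preliminary-intensity subgroups); all names are illustrative.

```python
from collections import Counter

MAIN_RANGES_DB = [(75, 85), (65, 75), (55, 65)]
SUBJECTS_PER_SUBGROUP = 15   # 135 Ss / 9 subgroups

def preliminary_centres(low_db, high_db):
    """High, medium, and low preliminary intensities for one main range."""
    return {"high": high_db, "medium": (low_db + high_db) // 2, "low": low_db}

def preliminary_values(centre_db, spread_db=2):
    """Values used after the first two trials: centre plus or minus spread_db."""
    return list(range(centre_db - spread_db, centre_db + spread_db + 1))

subjects_per_centre = Counter()
for low, high in MAIN_RANGES_DB:
    for centre in preliminary_centres(low, high).values():
        subjects_per_centre[centre] += SUBJECTS_PER_SUBGROUP

for centre in sorted(subjects_per_centre, reverse=True):
    print(centre, subjects_per_centre[centre])
print(preliminary_values(85))   # [83, 84, 85, 86, 87]
```

The values 75 and 65 db each occur in two adjacent main ranges (as the low value of one and the high value of the next), which is why they collect 30 Ss rather than 15.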

The third variable was length of the pre- 
liminary series. Again each of the nine 
groups was divided into three subgroups, and 
one subgroup received 2 preliminary trials, 
one subgroup received 10, and the third subgroup
received 20 preliminary trials before
shifting to the intensity range of the main
series.

TABLE 1
ANALYSIS OF VARIANCE OF MAIN SERIES RESULTS
(Score per S is the mean number of “more-than-half-as-loud”
responses per quarter of the main series.)

Source                           MS
Main intensity (I)               1325.38
Preliminary length (L)             97.28
Preliminary intensity (P)          15.38
I x L                              39.83
I x P                              48.61
L x P                              81.06
I x L x P                          92.32
Within                             55.85

The Ss were assigned to 27 different ex- 
perimental conditions in a predetermined 
order in which each successive block of 27 Ss 
was assigned one to each experimental condi- 
tion in a random order. 
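
The design and the block-wise assignment just described can be sketched as follows. This is a minimal illustration under stated assumptions: the condition labels, the function name, and the use of a seeded pseudo-random shuffle are all hypothetical, since the paper does not say how the random orders were generated.

```python
import random
from collections import Counter
from itertools import product

MAIN_RANGES_DB = [(75, 85), (65, 75), (55, 65)]   # main-series comparison ranges
PRELIM_LEVELS = ["high", "medium", "low"]         # relative to the main range
PRELIM_LENGTHS = [2, 10, 20]                      # number of preliminary trials

def assign_subjects(n_blocks=5, seed=0):
    """Return one condition per S; each block of 27 Ss covers every condition once."""
    rng = random.Random(seed)
    conditions = list(product(MAIN_RANGES_DB, PRELIM_LEVELS, PRELIM_LENGTHS))
    schedule = []
    for _ in range(n_blocks):
        block = conditions[:]      # fresh copy of the 27 conditions
        rng.shuffle(block)         # random order within the block
        schedule.extend(block)
    return schedule

schedule = assign_subjects()
counts = Counter(schedule)
print(len(schedule), len(counts), set(counts.values()))   # 135 27 {5}
```

Assigning in complete blocks keeps the 27 groups equal in size at every stage of data collection, so that any drift over the course of the study is spread evenly across conditions.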


RESULTS AND DISCUSSION 
Main series results—A complete 
analysis of variance was carried out on 
the results of the main series of judgments,
in which the number of “more-than-half-as-loud,”
or positive, responses per S was the individual score.
This analysis, summarized in Table 1,
showed that the main source of variance
was the “within” or error term.
The main series intensity was a significant
variable, but the other two
variables and all interactions were
nonsignificant. The expectation had
been that the main series intensity
would have no over-all effect on the
number of positive judgments, since
that was the result of the previous
experiment. In addition, however, it
was expected that low preliminary
intensities would increase the number
of positive responses while high preliminary
intensities would decrease
them, and that this variable would
interact with length of the preliminary
series such that the longer the preliminary
series the more pronounced
the effect.

The results show, of course, a com- 
plete failure of these expectations. 
Except for the main series intensity, 
no other source of variance was even 
near significance. In fact, most of the 
variance estimates provide rather good 
illustrations of the null hypothesis in 
action. 

The nature of the one significant 
variable is illustrated in Fig. 1, where 
cumulative frequency distributions 
for the 45 Ss in each of the main inten- 
sity groups are shown. The median 
percentages of positive responses are 
41.3, 49.6, and 52.9, respectively, from 
the lowest to the highest intensity 
range. The equivalent means are 
35.9, 48.4, and 53.2%; the discrep- 
ancies between the means and the 
medians are due to the moderate 
skewness in each of the distributions. 

Fig. 1. Cumulative distributions of the
“more-than-half-as-loud” judgments in the
main series. Each curve is the cumulative
distribution for the 45 Ss whose main series
intensities ranged between the values indicated
for each curve. The bottom abscissa
shows the actual total number of responses
made by each S; the top expresses the same
data in percentage form.

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While the previous experiment had 
shown no significant differences due to 
the main series intensity, the outcome 
of this experiment is not drastically 
different from that one, since none of 
the percentages is very far from 50, the 
value which would indicate complete 
response to the context of comparison 
intensities. In fact, the results of the 
present experiment seem more rea- 
sonable, since it seems unlikely that 
the over-all range of intensities should 
have no effect on the number of too 
loud responses. One possible source 
of the difference in the two experi- 
ments may be in the nature of the Ss, 
since in the previous experiment many 
were high school students and several 
were women. It would be interesting 
to determine whether there is a sex 
difference relating to this particular 
type of suggestibility. 


An analysis also was carried out
to determine whether there was any
shift in positive responses during the
course of the experiment, and a
significant increase was found from
the first to the second half of the experiment.
The over-all mean percentage
of positive responses increased
from 44.2 to 47.4, but this increase was
the same for all three intensity groups.
In other words, there was a general
tendency for positive responses to
increase during the second half of the
experiment, but this tendency was
unaffected by the actual intensities
being used.

Fig. 2. The relation between half-loudness
judgments and intensity for different
numbers of trials. The number of trials is
indicated inside the graph. For each curve
the number of Ss per point differs since different
Ss received different numbers of preliminary
trials. The intensity of the standard
tone was 90 db, and that of the comparison
tone is indicated on the abscissa.

Since the results of the over-all 
analysis had been so fruitless, one 
further analysis was carried out to 
determine whether the effects of the 
preliminary series might have in- 
fluenced the earlier judgments but 
not the later ones. A complete 
analysis of variance was carried out in 
which only the responses made during 
the first quarter of the final series were 
used. The results of this analysis 
were essentially the same as those of 
the complete analysis. In fact, the 
actual variance components were al- 
most identical. 

It seems clear, therefore, that the 
attempt to influence the judgments in 
the main series by deliberate manipula- 
tion of the preliminary intensities was 
completely futile. There remain large 
individual differences, but these dif- 
ferences are uninfluenced by experi- 
ence during the preliminary trials. 

Preliminary series results.—The re- 
sponses of the Ss during the prelimin- 
ary series are of some interest in them- 
selves, since they indicate how early 
the context effect operates. The 
results with respect to the very first 
stimulus have been previously pub- 
lished (Garner, 1958), but further 
analysis is worthwhile. Figure 2 
shows the percentage of positive 
responses during the preliminary trials 
themselves. Each of the curves is for 
a different total number of preliminary 
trials, and for each curve all of the 
available data have been used. For 
example, when 20 preliminary trials 
were plotted, only the groups which 
actually had that many preliminary 
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trials could be used, so that only a 
third of the total of the 135 Ss were
used for that function. When 10 
preliminary trials were plotted, then 
groups which received either 10 or 20 
preliminary trials were used. When 
one or two preliminary trials were 
plotted, data were usable from all Ss. 
When the preliminary data are used, 
the final intensity is of no considera- 
tion since it had never been used. 
Thus the data are plotted with re- 
spect to the actual intensity used 
rather than the preliminary intensity 
relative to the final intensity. 
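
The number of Ss contributing to each curve can be tallied directly. The sketch below assumes the design of the Method section (135 Ss, 45 in each preliminary-length group); the names are illustrative.

```python
PRELIM_LENGTHS = [2, 10, 20]       # preliminary trials given to each length group
SUBJECTS_PER_LENGTH_GROUP = 45     # 135 Ss / 3 length groups

def subjects_available(n_trials):
    """Ss who completed at least n_trials preliminary trials."""
    return sum(
        SUBJECTS_PER_LENGTH_GROUP
        for length in PRELIM_LENGTHS
        if length >= n_trials
    )

for n in (1, 2, 10, 20):
    print(n, subjects_available(n))
```

On these assumptions the 1- and 2-trial curves rest on all 135 Ss, the 10-trial curve on 90, and the 20-trial curve on 45.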

These functions show a highly sig- 
nificant relation between positive 
responses and intensity when only the 
first trial is used. By the end of the 
second trial, however, the context 
effect has already begun to operate, 
and the curve is slightly flattened to 
show less effect of intensity. This 
early flattening occurs primarily with 
respect to the more intense stimuli, 
where the percentage of positive re- 
sponses is lowered. This effect sug- 
gests that Ss had more confidence 
about their judgments when they gave 
a “too soft” response than when they
gave a “too loud” response. By the
end of 10 trials there is considerably 
less differentiation with respect to 
intensity, although the relation be- 
tween intensity and number of posi- 
tive responses is still significant. At 
the end of 20 trials, however, the 
relation is so flat that the correlation 
between intensity and number of 
positive responses (+.23) is not 
significant. 

The fact that the context effect 
seems to operate first with the more 
intense comparison stimuli means that 
the over-all percentage of positive 
responses at first decreases. As the 
context effect then begins to operate 
over the entire range of stimuli, the 
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percentage again rises. For 1, 2, 10, 
and 20 preliminary trials the over-all 
percentages of positive responses are 
31.9, 23.3, 25.1, and 35.0, respectively. 

It is clear from these data that the 
context effect begins to operate almost 
immediately, and that even the second 
comparative judgment is influenced by 
what S heard the first time. Thus 
data from this type of experiment 
which involve even a very few judg- 
ments cannot be considered unbiased 
by E’s selection of stimuli for S to 
hear. It is particularly interesting 
that the context effect operates most 
rapidly at the higher intensities in the 
region where Stevens (1955) suggests 
the true half-loudness value will be 
found. From these data, for example, 
an over-all half-loudness value com- 
puted from just the first judgments is 
12.4 db attenuation from the standard 
of 90 db. By the second judgment, 
this value has changed to approxi- 
mately 8 db attenuation from the 
standard. Does an obtained value 
this low therefore mean that the true 
half-loudness value is this low, or does 
it simply mean that the context 
effect has been operating? 
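
For readers who prefer levels to attenuations, the two half-loudness values can be converted as below. This is only a unit conversion under the standard decibel definition; the function names are illustrative, and the intensity ratio is an added convenience rather than a quantity reported in the paper.

```python
STANDARD_DB = 90.0   # level of the standard tone (Method section)

def half_loudness_level(attenuation_db, standard_db=STANDARD_DB):
    """Comparison level (db SPL) lying attenuation_db below the standard."""
    return standard_db - attenuation_db

def intensity_ratio(attenuation_db):
    """Sound-intensity ratio (comparison / standard) for a given attenuation."""
    return 10 ** (-attenuation_db / 10)

for attenuation in (12.4, 8.0):   # first judgment, second judgment
    print(
        attenuation,
        round(half_loudness_level(attenuation), 1),
        round(intensity_ratio(attenuation), 3),
    )
```

The shift from 12.4 to about 8 db attenuation moves the computed half-loudness point from 77.6 to about 82 db SPL after a single additional judgment.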

Individual differences. This experiment
was designed to attempt to
reduce the extent to which the large
and reliable individual differences
must simply be attributed to chance 
or unknown factors. Clearly the 
experiment did not succeed in this 
goal. The individual differences re- 
main the largest single source of 
variance, and they cannot be at- 
tributed to either of the critical pre- 
liminary series variables. Figure 1 
shows the extent of these individual 
differences. The reliability of the 
differences is quite good: the correla- 
tion between the first and second 
halves of the main series is .72. This
value is the mean of three correlations,
one for each of the main series intensity
groups. The correlation was
computed in this form since main series
intensity was a significant source of
variance and the pooling of the groups
would have produced an inflated and
spurious correlation.

Fig. 3. Correlation between preliminary
and main series scores as a function of the
number of preliminary trials. The abscissa
indicates the number of preliminary trials
used to compute the correlation coefficient;
the symbols on the graph indicate three different
groups in terms of the number of preliminary
trials actually run. The “x” indicates
the correlation between the main series scores
and the last 10 preliminary trials for the “20”
group. In all cases partial correlation coefficients
have been used, with intensity of
the main series partialled out.

Thus the paradox still remains. 
The over-all effect of the intensities 
of the comparison stimuli is small, but 
the individual differences within each 
of the groups are large and very reli- 
able. 

One further analysis can be made 
with these data, however, and that is 
the determination of how soon in a 
series of judgments the responses of a 
given S become stabilized. In order 
to determine this stabilization, correla- 
tions have been computed between 
the total number of positive responses 
on the main series and the number of 
positive responses for various num- 
bers of preliminary trials. Figure 3 


shows how this correlation increases 
with an increase in the number of pre- 
liminary trials. The actual correla- 
tions were computed separately for 
groups differing in number of pre- 
liminary runs, and the curve drawn 
is the average of these various corre- 
lations. For example, with the group 
which received 20 preliminaries, a 
correlation was computed using the 
first of these preliminaries, then the 
first two, the first ten, etc. For the 
group which received only two pre- 
liminary trials, only two of the corre- 
lations could be computed. 

It is clear that the number of final 
positive responses can be predicted 
with increasing accuracy as the num- 
ber of preliminary trials used for 
prediction is increased. There is es- 
sentially no predictability with just 
the first response, a slight bit with the 
first two, but by the end of 20 pre- 
liminaries the correlation is .44, a 
value which is approaching the ulti- 
mate reliability of the main series 
responses. Actually, even better pre- 
dictability is obtained with just 10 
preliminary scores if the last 10 out of 
20 are used instead of the first 10. 
This correlation was .49, which is 
considerably higher than that ob- 
tained with the first 10 preliminary 
responses. Clearly, Ss rapidly stabil- 
ize to their final score level. 


The correlation coefficients shown in
Fig. 3 are actually partial correlations,
with the main series intensity being the
term held constant. This procedure
was used because both preliminary and
final responses were correlated with this
factor, and its inclusion would have led
to spuriously high correlations. Since
the preliminary intensity relative to the
final intensity was not a significant source
of variance, it was not necessary to
partial this additional variable out. However,
correlations were in fact
computed with both the final and the
preliminary intensity partialled out, and
these correlations were not significantly
different from those shown in Fig. 3, a
fact which simply confirms the analysis
of variance results to the effect that preliminary
intensity does not affect final
responses.
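
The partialling described here can be illustrated with the usual first-order partial correlation formula. This is a sketch under assumptions: the paper does not state how main series intensity was coded or which computing formula was used, and the input values below are invented purely for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical values: x = preliminary score, y = main series score,
# z = main series intensity. None of these numbers come from the paper.
print(round(partial_correlation(0.5, 0.5, 0.5), 3))
# When z is unrelated to x and y, partialling changes nothing.
print(partial_correlation(0.44, 0.0, 0.0))
```

When both scores rise with the partialled variable, the raw correlation overstates their relation and the partial coefficient is the smaller of the two, which is the inflation the text is guarding against.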

It is this fact which is most surprising.
We have shown that the number of positive
responses S makes in his first 20
preliminary judgments is correlated with
the number he makes in his final series,
but this relation is unaffected by the shift
in intensity from his preliminary to his
final series. For example, suppose a
given S makes more than the average
number of positive responses when responding
to the 65-75 db range of final
comparison intensities. The correlation
between preliminary and final results
shows that probably he also made more
than the average number of positive
responses during his 20 preliminary
trials, but the lack of effect of the preliminary
intensity shows that it makes no
difference whether he made these positive
responses during the preliminary series
to an average intensity of 65 db or to an
average intensity of 75 db.

What seems to be operating, in view
of these results, is simply a tendency to
make a greater or fewer number of positive
responses, and this tendency persists
even when the actual intensities
involved change. It may be that we
are not dealing with a true stimulus context
effect, but rather with a response
context effect, or just with a response
set. Some Ss simply make more positive
or more-than-half-as-loud responses than
others, and this tendency is quite independent
of the actual intensities used.
This fact leads to the suspicion that the
S who responds with a large number of
positive responses with one set of final
comparison intensities would have done
the same for any other set of final intensities.
If this is so, then it should
be possible to introduce much greater
shifts in intensities and still maintain the
correlation of number of positive responses.

In this connection, it is interesting to 
note that Cronbach (1946) has pointed 
out the existence of response sets with 
objective tests, and he states that re- 
sponse sets have the greatest influence on 
performance in ambiguous or unstruc- 
tured situations. These results, as well 
as those of the previous experiment, make 
it eminently clear that experiments re- 
quiring half-loudness judgments more 
than qualify as ambiguous and unstructured
situations. When response sets
operate as strongly as they appear to in
this type of experiment, we have little
choice but to question the validity of
such judgments.


SUMMARY 


An experiment was conducted in which Ss
were required to make half-loudness judgments
with a method of constant stimuli.
Three variables were used in the experiment:
(a) the range of intensities of comparison
stimuli in the main series of judgments, (b)
the range of intensities in a preliminary
series of judgments, and (c) the length of the
preliminary series. A complete 3 x 3 x 3
factorial design was used. The results
showed:

(a) Only the intensity of the final series
was a significant variable in determining
number of “more-than-half-as-loud” responses.

(b) During the preliminary series the context
effect develops very rapidly so that by
the end of 20 judgments there is little relation
between intensity and number of positive
responses.

(c) The Ss are highly reliable in their responses
even though they are so little affected
by the experimental conditions, and this reliability
is already in evidence by the end of
10 preliminary trials.

(d) It is suggested that since the reliability
is quite independent of comparison intensities,
it is produced primarily by a response set
rather than by a true stimulus context.
This fact in itself adds strength to the supposition
that half-loudness judgments are of
dubious validity, since response sets operate
most strongly with ambiguous stimulus
situations.
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One of the most widely accepted 
explanations of reasoning error is the
“atmosphere effect” advanced by
Woodworth and Sells (1935) and by
Sells (1936) to account for patterns
of error in syllogistic tasks. The
atmosphere effect receives favorable
mention in such well-known textbooks
as those by Underwood (1949)
and Woodworth and Schlosberg
(1956), and in Stevens’ Handbook
(Miller, 1951).

Although Sells concluded his article
with the caution, “The results obtained,
far from possessing finality, are
much rather a point of departure for
further researches in this field,” such
further research has not been done
and his conclusions have often been
accepted without question.

The present paper will present a
re-examination of the atmosphere
effect. The design of Sells’ study
will be scrutinized, a new study will
be presented, and inferences will be
made concerning the preferred errors
in syllogistic reasoning. An attempt
will be made to offer an interpretation
other than atmosphere effect to explain
the error patterns.


' The preparation of this study for publica- 
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The authors are indebted to Warner Wick 
of the University of Chicago Philosophy 
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earlier draft of the manuscript, and for his 
suggestions on the interpretation of the results. 
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We must first define the terms used to de- 
scribe syllogisms. The terminology is a uni- 
form one, based on long tradition. For de- 
tails see any introductory logic textbook, 
such as Cohen and Nagel (1934). The fol- 
lowing summary, however, presents sufficient 
information for purposes of this paper. 

There are four forms of categorical pro- 
positions used in syllogisms: 

Name                      Expression              Symbol
Universal affirmative     All S's are P's         A
Universal negative        No S's are P's          E
Particular affirmative    Some S's are P's        I
Particular negative       Some S's are not P's    O


A syllogism consists of three such statements,
the first two of which are the premises.

1. The major premise, which states the
relation between the middle term and the
predicate of the conclusion.

2. The minor premise, which states the
relation between the middle term and the
subject of the conclusion.

3. The conclusion, which is the inferred, or
deduced, relation between the minor term
and the major term.


The figures of the syllogism are the four
possible arrangements of the terms in the
major and minor premises, in which S is the
subject of the conclusion, P the predicate of
the conclusion, and M the middle term.

                   Fig. I    Fig. II   Fig. III  Fig. IV
(Major premise)    M-P       P-M       M-P       P-M
(Minor premise)    S-M       S-M       M-S       M-S
(Conclusion)       S-P       S-P       S-P       S-P
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The mood of a syllogism refers to the com- 
bination of three propositions from among the 
four kinds of categorical propositions, stated 
in the order of major premise, minor premise, 
and conclusion. For example, a syllogism of 
the AII mood, in the first figure, is as follows: 


All M’s are P’s. 
Some S's are M's. 
Therefore:

Some S's are P’s. 


This is a valid syllogism. 

Each of the three propositions of a syllo- 
gism could be an A, E, I, or O. Considering 
the major and minor premises only, there are 
16 such possibilities of combination. Fourteen
of these 16 yield no valid conclusion in
one or more of the four figures. 
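
The counting in this paragraph can be reproduced with a short enumeration. The sketch only lists combinations of the four proposition types; it does not test validity, and the variable names are illustrative.

```python
from itertools import product

FORMS = "AEIO"   # the four categorical proposition types

# Major/minor premise pairs, ignoring the conclusion.
premise_pairs = ["".join(pair) for pair in product(FORMS, repeat=2)]
# Moods: major premise, minor premise, conclusion.
moods = ["".join(mood) for mood in product(FORMS, repeat=3)]

print(len(premise_pairs))   # 16
print(len(moods))           # 64
print("AII" in moods)       # True
```

Crossing the 64 moods with the four figures gives 256 distinct syllogistic forms, of which only a small minority are valid.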

Sells (1936) gave students 169 syllogisms, 
of which 127 were invalid, and asked them to 
decide the truth or falsity of each. The Ss 
could mark each syllogism as “absolutely
true,” “probably true,” “indeterminate,” or
“absolutely false.” For purposes of scoring,
Sells considered the “absolutely true” and the
“probably true” to be agreement, and considered
the “indeterminate” and the “absolutely
false” to be disagreement with the
item. Using this scoring system, he calcu- 
lated the percentage of Ss who agreed with 
the conclusion of each invalid syllogism. 
He found that error scores were very high, 
the incorrect agreements running between 2% 
and 80%. He also found that the type of 
invalid conclusion which was most often 
accepted varied a great deal among the differ- 
ent pairs of premises. 

He asserted that the pattern of the error 
preferences could be explained by the atmos- 
phere effect. The atmosphere effect has 
been formulated in two different ways. 

In their original formulation, Woodworth 
and Sells (1935) describe the atmosphere 
effect in syllogistic reasoning as a drawing of 
conclusions on the basis of the global impres- 
sion of the premises. Thus an affirmative 
premise, i.e., “all are” or “some are” (A or I)
produces an affirmative atmosphere, and a
negative premise, i.e., “none are” or “some are
not” (E or O) produces a negative atmosphere.
A universal premise (“all are” or “none are”)
produces a universal atmosphere, and a particular
premise (“some are” or “some are not”)
produces a particular atmosphere. Sells
presents two subprinciples of atmosphere
effect as self-evident. These are: (a) a combination
of a universal and a particular
premise produces a particular atmosphere; (b)
a combination of an affirmative premise with a
negative premise creates a negative atmosphere.
It should be noted that these rules
predict a conclusion for one kind of premise
pair which has a wording other than that of
either of the premises, i.e., for a universal
negative combined with a particular affirmative,
they predict a particular negative.
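
The two subprinciples amount to a simple decision rule, which can be sketched as follows. This is an illustrative reading of the original (1935) formulation only; the function name is hypothetical, and the later principle of “caution” is not included.

```python
UNIVERSAL = {"A", "E"}    # "all are", "none are"
NEGATIVE = {"E", "O"}     # "none are", "some are not"
FORMS = {"A", "E", "I", "O"}

def atmosphere_conclusion(major, minor):
    """Conclusion type favored by the atmosphere of a premise pair."""
    if major not in FORMS or minor not in FORMS:
        raise ValueError("premises must each be one of A, E, I, O")
    # (a) any particular premise makes the atmosphere particular
    particular = major not in UNIVERSAL or minor not in UNIVERSAL
    # (b) any negative premise makes the atmosphere negative
    negative = major in NEGATIVE or minor in NEGATIVE
    if negative:
        return "O" if particular else "E"
    return "I" if particular else "A"

print(atmosphere_conclusion("A", "A"))   # A
print(atmosphere_conclusion("E", "I"))   # O
print(atmosphere_conclusion("A", "O"))   # O
```

The E-I case shows the point made in the text: the predicted conclusion (O) has the wording of neither premise.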

In addition to the atmosphere effect, 
Woodworth and Sells also used a principle of 
“caution” to explain the pattern of error pref- 
erences. “Caution” is a tendency to accept 
weak and guarded conclusions rather than 
strong conclusions. This means acceptance 
of “some are” conclusions more readily than 
“all are” conclusions, and “some are not” 
conclusions more readily than “none are” 
conclusions. 


In a second article, Sells re-formulated the 
atmosphere effect to incorporate this principle 
of caution. Thus he states (1936, p. 34) that 
the conclusion favored by atmosphere for 
AA premises is A, or the weaker I. Using the 
revised atmosphere effect he reported 100% 
success in predicting the preferred errors in 
all moods. However, Sells does imply in a 
footnote (1936, p. 36) that the incorporation 
of caution into atmosphere may not be justi- 
fied. 

Sells found that for the 16 possible paired 
combinations of the four kinds of premises, 
acceptance of I conclusions always exceeded 
A; acceptance of O conclusions exceeded E in 
all except one borderline case; and either I or 
O was the preferred error for all but one of the 
16. His formulation of atmosphere effect 
was advanced as accounting for these error 
preferences. 


However, the nature of his test 
format might be expected to dictate 
high scores on I or O. If an A 
answer is accepted (e.g., all A’s are 
B), logically, an I must also (e.g., 
some A’s are B), and similarly if an E 
answer is accepted, an O must be also. 
Therefore, if Ss were self-consistent on 
Sells’ test, all those Ss who regarded 
an A or E conclusion as acceptable for 
a given premise pair would, neces- 
sarily, also regard as acceptable the I 
and O conclusions respectively, when 
these were offered. Thus I or O 
acceptances should never be smaller 
than A or E on this true-false format, 
and would be expected to be larger. 
Thus it appears that some of Sells’ 
findings might be an artifact of his 
test format, rather than being at- 
tributable to “atmosphere.” 
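
The argument can be illustrated with made-up numbers. In the sketch below the endorsements are hypothetical and `close_under_subalternation` is a name introduced here; the only point is that once each S's endorsements are made self-consistent, the I and O counts cannot fall below the A and E counts.

```python
# Hypothetical endorsements on a true-false format; the data are invented
# solely to illustrate the logical point and are not taken from Sells.

IMPLIES = {"A": "I", "E": "O"}  # a universal entails its particular


def close_under_subalternation(accepted):
    """Add I whenever A is accepted and O whenever E is accepted."""
    closed = set(accepted)
    for universal, particular in IMPLIES.items():
        if universal in closed:
            closed.add(particular)
    return closed


def acceptance_counts(subjects):
    """Count how many self-consistent Ss accept each conclusion form."""
    counts = {form: 0 for form in "AEIO"}
    for accepted in subjects:
        for form in close_under_subalternation(accepted):
            counts[form] += 1
    return counts


if __name__ == "__main__":
    subjects = [{"A"}, {"A"}, {"I"}, {"E"}, set(), {"O"}]
    print(acceptance_counts(subjects))
```

With these invented endorsements, two Ss accept A and three accept I, and one accepts E while two accept O, so the particular conclusions lead even though no S preferred them over a universal.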

On the other hand, if these error 
patterns are primarily accounted for 
by “atmosphere,”’ then they ought 
also to be found in a multiple-choice 
format in which S is given two prem- 
ises and can choose from among the 
various possible conclusions on a single 
item. This is the method of the 
present investigation. 

There is an additional source of 
confusion in evaluating Sells’ find- 
ings. Inspection of his items and 
their designations reveals that in the 
designation of the mood for 57 of his 
invalid syllogisms he differed from 
the conventional use of the term, al- 
though he did not indicate to the 
reader that he did so. This discrep- 
ancy consisted of labeling the mood in 
terms of the order of presentation of 
the premises, ignoring the distinction 
between major and minor premises. 
He also labeled figure ignoring this 
distinction in 27 items, and did not 
clearly indicate this.³ 

Sells (1936, Appendix A) stated 
that he had used invalid syllogisms in 
figures and moods which by conven- 
tional designation are valid, such as 
EIO which is valid in all figures. The 
test items which he listed as invalid 
were invalid, but were in moods and 
figures other than those he indicated. 
This makes it impossible to determine 
if the error preferences might be re- 
lated to the logical status of the syl- 
logisms. 


3 In describing his item designations he 
stated (p. 59) that he labeled “figure of the 
syllogism (with respect only to position of the 
middle term),” but on the same page illus- 
trated the meaning of “position of the middle 
term” by the usual diagrams which show the 
traditional relationships and order of the 
major, middle, and minor terms. 
METHOD 


Materials.—A syllogism test⁴ was con- 
structed which consisted of 42 experimental 
items and 10 filler items, each containing two 
premises and five alternative conclusions, 
e.g., 

Some L’s are K’s 
Some K’s are M’s 
Therefore: 
1. No M’s are L’s. 
2. Some M’s are L’s. 
3. Some M’s are not L’s. 
4. None of these. 
5. All M’s are L’s. 


The correct answer for all 42 experimental 
items was “none of these.” 

Like Sells, we used statements only about 
letters, in order to avoid the influence of previ- 
ous knowledge and opinions on error prefer- 
ence. 

Three syllogisms were constructed for each 
of the 14 combinations of premises for which, 
in one or more figures, no conclusion is possi- 
ble. For those pairs of premises for which no 
conclusion followed in more than one figure, 
the different figures were used. A complete 
listing of the figures and moods is shown in 
Table 1. 

In addition to the 42 experimental items 
for which a valid conclusion was not possible, 
there were 10 items for which a valid conclu- 
sion could be reached. These were filler 
items, included to prevent Ss from discovering 
that none of the experimental items had a 
valid conclusion except “none of these.” 

The five alternative conclusions were as- 
signed randomly to the five positions, with 
the restriction that each alternative appeared 
equally often in each position. 

Instructions.—The instructions gave an 
example of an AA-first figure syllogism (for 
which either A or I is a valid conclusion) and 
specified the task as marking the correct 
alternative from among the five choices. 

In addition, the instructions specified that 
choosing the alternative, “None of these,” 
would mean that none of the four other 
alternatives is correct. Following the lead of 
Sells, the instructions discussed the meaning 
of the word “some,” as in the expression 
“Some X’s are Y’s.” The word “some” was 
to mean “at least some” and would not, in 
itself, necessarily mean that some X’s are not 
Y’s. 

Subjects.—The Ss were 222 students in 
introductory psychology classes at North- 
western University who stated in response to 
a question that they had not dealt with syl- 
logistic reasoning in a college course. 

4 A copy of the test may be obtained from 
the authors. 


RESULTS 


The percentage of Ss who chose each 
alternative is listed for the various 
combinations of premises and figures 
in Table 1. As seen there, accuracy 
scores were very low, the mean ac- 
curacy score being 20%. The sheer 
number of errors may be in part at- 
tributed to the fact that students do 
not expect a test to consist chiefly of 
problems with no solution. How- 
ever, this does not predict preference 
for one wrong solution over another. 
There was a marked piling up of 
choices on one of the four erroneous 
conclusions for all of the kinds of 
premise pairs except EO and OE, on 
which the error preference split be- 
tween two alternatives, O and E. 
For every item, the distribution of 
error preference departed from chance 
(P < .001), as shown by chi square. 
The preferred error varied greatly 
among the 14 kinds of pairs of prem- 
ises. Unfortunately, there is no 
readily available statistical technique 
for determining the significance of the 
differences in error preference between 
the various items and groups of items. 
However, for most of the relevant 
comparisons the differences are so 
massive and clear-cut that a signifi- 
cance test is not required. For most 
items, a single erroneous alternative 
was chosen by a majority of Ss. 
Moreover, there is a striking consist- 
ency for all items using the same type 
of premise pair; at the same time 
there are striking differences among 
the different kinds, except between 
those which contain the same prop- 
osition forms in differing positions. 
For example, AI and IA premises 
yield quite similar results. Pairs 
of moods containing the same prop- 
osition form have, for the most part, 
the same error preference without re- 
gard for which is the major or minor 
premise; this fact justifies comparison 
of these results with Sells’ even though 
he did not utilize the distinction. 


TABLE 1 

PERCENTAGE OF Ss CHOOSING EACH ALTERNATIVE * 

[Table body not recoverable from the scan. Surviving column labels: Item No., Premises, Fig.; surviving row fragments: OE items, and OO items in Figures III, IV, and II.] 

* N designates choosing the alternative “None of these.” The other designations are explained in the text. 
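
The per-item test against chance mentioned in the Results can be sketched as a goodness-of-fit chi square over the four erroneous alternatives. The counts below are hypothetical (the item-level frequencies are not recoverable here), and treating the test as having df = 3 is an assumption about how it was run.

```python
def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Tabled critical value of chi square for df = 3 at the .001 level.
CRITICAL_DF3_P001 = 16.27

if __name__ == "__main__":
    # Hypothetical numbers of Ss choosing the A, E, I, and O errors on one item.
    error_counts = [10, 20, 120, 30]
    statistic = chi_square_uniform(error_counts)
    print(round(statistic, 2), statistic > CRITICAL_DF3_P001)
```

A piling up of choices on a single alternative of the kind described gives a statistic far beyond the tabled value, which is why a departure from chance is reported for every item.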


DISCUSSION 


Let us consider whether the atmosphere 
effect explains the error preferences 
shown in Table 1. Taking the formula- 
tion as originally advanced by Wood- 
worth and Sells (1935) (i.e., omitting 
the principle of caution), we find that this 
set of principles fails to fit the results for 
IE and OE premises, on both of which 
the preferred error is E while the pre- 
dicted error is O. It also does poorly on 
EO, for which the results show E to be a 
strong rival of the predicted O. 

Sells’ (1936) second and final formula- 
tion of the atmosphere effect (i.e., in- 
cluding the principle of caution) is even 
less applicable to the results. Of the 14 
kinds of pairs of premises, it fails to pre- 
dict not only for the three mentioned 
above, but also for AA, AE, and EE. 
It appears that for these five or six kinds 
of premise pairs, Sells’ finding that the 
particular conclusions were more popular 
than the universal may be attributable 
to his true-false test format. 


Since the atmosphere predictions are 
not substantiated, we must look for 
other principles of explanation. How- 
ever, we will be able to offer only intuitive 
evidence for them. 

In seeking such hypotheses we should 
take into account what we already know 
about reasoning behavior. First of all, 
it is known from the findings of Wilkins 
(1928) and of Sells (1936) that many Ss 
interpret the A and O propositions to 
mean that the converse is also true. They 
interpret the statement “All A’s are 
B’s” to mean that “All B’s are A’s,” and 
the statement “Some A’s are not B’s” to 
mean that “Some B’s are not A’s.” 
(Acceptance of the converse is valid for 
E and I propositions.) 

Such interpretations, although logi- 
cally invalid, often correspond to our ex- 
perience of reality, and being guided by 
experience are usually regarded as justi- 
fiable procedures. One may realistically 
accept the converse of many, perhaps 
most, O propositions about qualities of 
objects; e.g., some plants are not green, 
and also some green things are not plants. 
The reason for this is that when we assert 
one particular negative about A and B 
(some A’s are not B’s), we normally are 
not in a position to assert the universal 
affirmative (all B’s are A’s), that would 
rule out the converse particular negative 
(some B’s are not A’s). In short, ‘Some 
green things are not plants’ would be 
false only if all green things were plants. 

The acceptance of the converse of the 
A propositions is also often appropriate, 
e.g., all right angles are 90°, also all 90° 
angles are right angles. Our students’ 
chief exposure to formally presented de- 
ductive reasoning had been in intro- 
ductory mathematics courses in which 
such reversal of A propositions is usually 
justified because in that context “are” 
means “are equal to” rather than the 
syllogistic “is included in.” 

The principle of accepting the converse 
of A and O propositions would explain 
the results for all of our six kinds of 
premise pairs (the first 18 items of Table 
1) that yield a valid syllogistic conclu- 
sion in at least one other figure (other 
than the figures used here). 
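
How accepting a converse turns an invalid pair of premises into a valid syllogism can be checked by brute force. The sketch below is an illustration under stated assumptions, not the authors' procedure: it enumerates every way of populating the eight regions of a three-circle diagram, requires each term to have at least one member (so that A implies I, as in the instructions given to Ss), and reports which conclusions about S and P hold in all models of the premises. The names `holds`, `models`, and `valid_conclusions` are introduced here for the example.

```python
from itertools import product

TERMS = ("S", "M", "P")
# Each region of a three-circle diagram is a triple of booleans:
# membership in S, in M, and in P.
REGIONS = list(product([False, True], repeat=3))


def holds(form, subj, pred, model):
    """Truth of a categorical proposition in a model (list of inhabited regions)."""
    i, j = TERMS.index(subj), TERMS.index(pred)
    both = any(r[i] and r[j] for r in model)
    only_subj = any(r[i] and not r[j] for r in model)
    if form == "A":
        return not only_subj
    if form == "E":
        return not both
    if form == "I":
        return both
    if form == "O":
        return only_subj
    raise ValueError(form)


def models():
    """All models in which every term has at least one member."""
    for bits in product([False, True], repeat=len(REGIONS)):
        model = [r for r, b in zip(REGIONS, bits) if b]
        if all(any(r[k] for r in model) for k in range(3)):
            yield model


def valid_conclusions(premise1, premise2):
    """Forms X such that 'X(S, P)' follows from the two premises."""
    ms = [m for m in models() if holds(*premise1, m) and holds(*premise2, m)]
    return [f for f in "AEIO" if all(holds(f, "S", "P", m) for m in ms)]


if __name__ == "__main__":
    # AA in the second figure: All P are M, All S are M.
    print(valid_conclusions(("A", "P", "M"), ("A", "S", "M")))
    # Same pair after wrongly converting the major premise to All M are P.
    print(valid_conclusions(("A", "M", "P"), ("A", "S", "M")))
```

The first call returns no valid conclusion, while the second, in which “All P are M” has been replaced by its converse, yields both A and I. An S who accepts the converse is therefore led to a universal affirmative answer instead of “none of these.”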

In the remainder of the items, accept- 
ance of the converse of itself yields no 
syllogistic conclusion. For these, an 
additional principle must be found. It 
is known that conclusions are often 
reached by probabilistic inference (Cohen 
& Nagel, 1934; Mill, 1879), and our Ss 
had no way of knowing that all but strict 
deductive reasoning is disallowed in the 
syllogistic game. By one kind of prob- 
abilistic inference, S reasons that things 
that have common qualities or effects are 
likely to be the same kinds of things, but 
things that lack common qualities or 
effects are not likely to be the same. In 
the syllogism, the available common 
characteristic is the middle term. Such 
thinking is not unreasonable; rather it is 
the reasoning process by which most 
science progresses. Thus a chemist 
might reason as follows: “Yellow and 
powdery material has often been sulphur. 
Some of these test tubes have yellow 
powdery material. Therefore some of 
these test tubes contain sulphur.” This 
is an invalid III syllogism of the second 
figure, yet the conclusion has some prob- 
ability. The scientist would regard the 
conclusion as tentative and attempt to 
check it by other means. 


Probabilistic inference, coupled with 
the acceptance of converses, also ex- 
plains the errors in the remaining premise 
pairs. In the case of an I coupled with 
an O premise in the second figure, e.g., 
“Some A’s are B’s, some C’s are not 
B’s,” S reasons that some A’s and some 
C’s do not share the common quality of 
B and therefore some C’s are not A’s. 
For O propositions in which the middle 
term is the subject rather than the predi- 
cate, the proposition must be restated 
to express the converse. Probabilistic 
inference yields analogous results for the 
case of an I with an E. 

In the case of two E premises such as 
“No A’s are B’s, no C’s are B’s,” the 
middle term, B, is not shared. There- 
fore, by probabilistic inference, no C’s 
are A’s. Similarly, two O premises 
yield an O conclusion. However, in the 
latter case the propositions must, when 
necessary, be restated in the converse to 
cast them in the second figure. 

The predicted conclusions by prob- 
abilistic inference for EO and OE are less 
clear. For example, “No A’s are B’s, 
some C’s are not B’s.” Should we con- 
clude that some C’s are not A’s, or that 
no C’s are A’s? As seen in Table 1, Ss 
split their choices between these two 
alternatives, a result which further sup- 
ports the interpretation that such prob- 
abilistic inference was used. 

The 18 items for which only accept- 
ance of the converse yields a conclusion 
have a mean accuracy score of 9%, and 
those for which probabilistic inference 
is necessary have a mean accuracy score 
of 28%. 
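
As a consistency check, the two subgroup means can be combined into an overall mean, weighting each by its number of items. The sketch assumes the second group comprises the remaining 42 - 18 = 24 experimental items; `weighted_mean` is a helper name introduced here.

```python
def weighted_mean(groups):
    """Mean of per-item scores given (number_of_items, mean_score) pairs."""
    total_items = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_items


if __name__ == "__main__":
    # 18 converse-only items at 9%, and the remaining 24 items at 28%.
    overall = weighted_mean([(18, 9.0), (24, 28.0)])
    print(round(overall, 1))  # 19.9
```

The result, about 19.9%, agrees with the mean accuracy score of 20% reported in the Results.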

We might describe the errors in Table 
1, then, in terms of Ss’ regarding as 
proved something which is merely prob- 
able. If so, they were behaving as 
fairly reasonable but incautious people. 
This corresponds to our impressions 
gained from everyday observations of 
undergraduates. 

Von Domarus (1944) and more re- 
cently Arieti (1955) have suggested that, 
in syllogistic reasoning, concluding that 
two things are the same because they 
share a common quality is distinctively 
pathological. They say this error is 
found in schizophrenics but not in 
normals. Clearly, the present findings 
contradict their suggestion. If those 
writers are reporting a partially valid 
clinical observation, the validity must 
exist in a greater error tendency among 
schizophrenics, or the appearance of the 
error in contexts in which normals would 
not show it. Such a difference of degree 
rather than kind would be consistent 
with other evidence, recently pointed out 
by Chapman (1958), that many aspects 
of the so-called schizophrenic thinking 
disorder consist in exacerbations of 
normal error tendencies. 

The principles advanced here to explain 
the error preferences for the 14 types of 
premise pairs are intended to deal only 
with the main error preferences. There 
seem to be some other less important 
systematic sources of error choice. For 
example, on pairs of premises involving 
two particulars (II, IO, OI, and OO) one 
might expect that A and E conclusions 
would be only random error. Yet E 
choices are consistently more frequent 
than A choices. This is probably due to 
the tendency, noted by logic teachers, 
to misinterpret a statement of the form 
“No A’s are B’s” to mean that nothing 
has been proved. 

In addition, in most cases where an I 
or O proposition is the preferred error, the 
other of the two propositions is the second 
most preferred error (except for EO and 
OE premises). 
This error pattern might be attrib- 
uted to the fact that I and O prop- 
ositions imply one another except when 
we are in a position to assert a contra- 
dictory universal. For example, the 
statement “Some A’s are B’s” implies 
that some A’s are not B’s unless we can 
assert that all A’s are B’s. In the 
normal use of particulars, we are not in a 
position to assert this stronger universal. 

The suggestions that we have ad- 
vanced to explain the error preferences 
are tentative ones. We have offered one 
possible solution, but other investigators 
might prefer a different one. 


SUMMARY 


The atmosphere effect in syllogistic rea- 
soning, as advanced by Woodworth and Sells 
(1935) and by Sells (1936), was investigated 
by means of a multiple-choice syllogisms test. 
Marked and consistent error preferences were 
found which did not coincide with those pre- 
dicted by atmosphere. 

The pattern of error preferences was tenta- 
tively ascribed to reasoning behavior which 
often leads to correct solutions of everyday 
problems but which is disallowed in the tradi- 
tional rules of the syllogism. 
EFFECTS OF DIFFERENTIAL TRAINING ON 
TACHISTOSCOPIC RECOGNITION THRESHOLDS ¹ 


ROBERT L. SPRAGUE 


Indiana University 


This experiment was an attempt to 
provide data on the relative efficacy 
of oral versus written word frequency 
in determining visual recognition 
thresholds. McGinnies (1949, 1950) 
contended that usage of a word, 
specifically oral usage in addition to 
written usage as emphasized by 
others, is a factor in determining 
tachistoscopic thresholds. Howes 
and Solomon (1950) and Postman, 
Bronson, and Gropper (1953) stressed 
the importance of the written fre- 
quency variable, but Lazarus (1954) 
concurred with McGinnies’ emphasis 
on the oral factor. 


In the absence of oral frequency 
lists comparable to the Thorndike- 
Lorge word counts for written English, 
Solomon (1951) devised a technique 
for experimentally manipulating this 
variable. This technique was sub- 
sequently used effectively by Solomon 
and Postman (1952) and Postman 
and Rosenzweig (1956) on a related 
problem. A variation of Solomon’s 
procedure was used in the present 
study. 


1 This research was done as a thesis in 
partial fulfillment of the requirements for the 
Master of Arts degree at Indiana University. 
The author wishes to thank Arnold Binder 
who served as chairman of his committee and 
Harry Yamaguchi and Leon Levy who served 
as the other members of the committee. 
Parts of this paper were read at the APA 
meeting in Washington, D. C., August 1958. 

This research was supported by Research 
Grant M-1259 from the National Institute 
of Mental Health of the National Institutes 
of Health, U. S. Public Health Service. 


METHOD 


Subjects.—The Ss were 52 student volun- 
teers from psychology classes at Indiana Uni- 
versity. Seven Ss were discarded, leaving a 
total of 45 Ss: 34 men and 11 women. 

Apparatus.—In the training phase, non- 
sense words were presented by means of an 
exposure apparatus previously described 
(Binder & Feldman, 1959). About 14 sec. 
prior to the presentation of the stimulus an 
indicator buzzer sounded, then the stimulus 
appeared for 1} sec., and a 9-sec. intertrial 
interval followed before the next stimulus 
appeared. In the testing phase, a modified 
Dodge mirror tachistoscope, similar to the 
device described by Kupperian and Golin 
(1951), was used to present the stimuli at 
controlled temporal intervals. The timer of 
the tachistoscope permitted a range of ex- 
posure times from .001 to 1.099 sec. in .001- 
sec. steps. 

All the stimulus words used in the experi- 
ment were a combination of two syllables 
taken from Glaze’s (1928) list of nonsense 
syllables of 60% association value. The non- 
sense words were made as dissimilar as possi- 
ble. Five nonsense words (SAHVUL, REB- 
KIC, KESQEL, FETVUB, and NISGOZ) 
were used as core words, i.e., they were pre- 
sented in the test trials, and for every S a 
sample of four of these words was drawn to 
present in the training trials. Four filler 
words (LIMVES, JOHMAV, DASBOH, and 
MESNIF) appeared only once each in the 
training trials and not in the test trials. An- 
other six words (HIZKUS, KUTWUF, 
CEDPOW, RAHFOZ, NADWUN, and 
PABPOG) were used to adapt Ss to the 
tachistoscope before the test trials began. 


Procedure 


Each S was randomly assigned to one of 
three experimental groups: a “pronounce” 
group, a “read” group, and a “‘verbal”’ group. 
The Ss were first brought into the room where 
the exposure apparatus was located for the 
training phase of the experiment. The 
“pronounce” Ss were instructed to read and 
pronounce each word which was shown. The 
“read” Ss were instructed to read silently to 
themselves each word which was shown with 
the exception that they were told to pronounce 
each new word which appeared in the list the 
first time it appeared and only then. It was 
felt that this latter condition was necessary to 
eliminate uncertainty about the proper pro- 
nunciation of a word which might adversely 
affect their thresholds. The “verbal” Ss 
were instructed to repeat aloud each word 
after hearing E pronounce it. All groups were 
told that they would be tested on the words 
after the presentation. Immediately follow- 
ing the training trials, the “pronounce” Ss 
and “read” Ss were requested to write all the 
words which they could recall from the list of 
words presented to them. The “verbal” Ss 
were instructed to repeat every word they 
could remember from the list of words while 
E recorded these words. Thus the “verbal” 
Ss never saw the nonsense words in the train- 
ing trials or in the memory test. The non- 
sense words were first presented to them in the 
tachistoscope during the test trials. 

In the training list of 40 words, 36 con- 
sisted of the five core words at the frequencies 
of 0, 1, 5, 10, and 20. The core words were 
counterbalanced in each experimental group 
so that each core word appeared at each fre- 
quency position an equal number of times. 
The remaining four words in each list were the 
four filler words which appeared once each. 
The order of the words in the list was ran- 
domized according to three predetermined 
randomized orders A, B, and C. These lists 
were counterbalanced in each of the experi- 
mental groups. 
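
The composition of a training list can be sketched as follows. This is an illustration only: the cyclic `rotation` used to counterbalance core words over frequencies and the seeded shuffle are assumptions, whereas the experiment used three predetermined random orders (A, B, and C) whose contents are not given.

```python
import random
from collections import Counter

CORE_WORDS = ["SAHVUL", "REBKIC", "KESQEL", "FETVUB", "NISGOZ"]
FILLER_WORDS = ["LIMVES", "JOHMAV", "DASBOH", "MESNIF"]
FREQUENCIES = [0, 1, 5, 10, 20]


def build_training_list(rotation, rng):
    """Build one 40-word list; `rotation` shifts which core word gets which frequency."""
    words = []
    for position, frequency in enumerate(FREQUENCIES):
        core = CORE_WORDS[(position + rotation) % len(CORE_WORDS)]
        words.extend([core] * frequency)
    words.extend(FILLER_WORDS)  # each filler word appears once
    rng.shuffle(words)  # stand-in for the predetermined orders A, B, and C
    return words


if __name__ == "__main__":
    training_list = build_training_list(rotation=0, rng=random.Random(1))
    print(len(training_list))
    print(Counter(training_list))
```

Cycling `rotation` through 0 to 4 places every core word at every frequency exactly once, which is one simple way to meet the counterbalancing requirement described above.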

After the training trials and the memory 
test were finished, Ss were taken into an ad- 
jacent room where the tachistoscope was lo- 
cated. All Ss were instructed to look through 
the eyepiece and gaze at a fixation point on the 
tachistoscopic screen. They were told that 
E would sound a buzzer and then flash a word 
on the screen for a brief interval. The E told 
them he would flash the same word until they 
correctly identified it, but they would not be 
informed when their responses were correct. 

Not more than 10 min. nor less than 8 min. 
of adaptation trials were given preceding the 
test trials. An ascending method of limits 
was used to determine the threshold. The 
steps were 4 msec. A buzzer was sounded 
about 1 sec. before the word was flashed. A 
word was flashed in the tachistoscope approxi- 
mately once every 5 sec. until it was correctly 
pronounced for three consecutive trials. The 
exposure time of the first of these three con- 
secutively correct recognitions was taken as 
the threshold. 
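
The threshold criterion can be expressed as a short routine. The sketch assumes that the exposure time rises by one 4-msec. step on every flash and starts at 4 msec.; the function name and parameters are introduced here for illustration, and the special handling of mispronounced words is not modeled.

```python
def recognition_threshold(responses, start_ms=4, step_ms=4, run_length=3):
    """Exposure time (msec.) of the first trial in the first run of correct responses.

    `responses` is a sequence of booleans, one per flash, in presentation order.
    Assumes the exposure time rises by `step_ms` on every flash.
    Returns None if no run of `run_length` correct responses occurs.
    """
    run = 0
    for trial, correct in enumerate(responses):
        run = run + 1 if correct else 0
        if run == run_length:
            first_trial = trial - run_length + 1
            return start_ms + first_trial * step_ms
    return None


if __name__ == "__main__":
    # One isolated correct response, then three in a row from the fifth flash on.
    print(recognition_threshold([False, False, True, False, True, True, True]))
```

In this example the isolated correct response on the third flash is ignored, and the threshold is the exposure time of the fifth flash, 20 msec.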

If S was mispronouncing a word, the trials 
were continued until he had mispronounced 
the word 10 consecutive times. If S then 
could spell the word correctly, the threshold 
was taken at the first one of these 10 trials, 
but if S misspelled the word, the trials were 
continued until he pronounced the word cor- 
rectly three consecutive times. The order of 
presentation of the words on both the adapta- 
tion and test trials was randomized. 

Because it was quite difficult to select a 
starting exposure time which was appropriate 
for the differentially changed levels of recogni- 
tion of Ss from the three groups, it was de- 
cided, on the basis of pilot study data, to set 
the starting exposure time 200 msec. (for 
“pronounce” Ss and “read” Ss) and 100 msec. 
(for “verbal” Ss) below the mean threshold 
of the last three adaptation words for each S. 
However, all Ss, except five “verbal” Ss, had 
the same starting exposure time, i.e., 4 msec. 


RESULTS 


The mean recognition threshold 
was computed for each group at each 
frequency. These data are presented 
in Fig. 1. 
The frequency effect is a “within” 
treatment groups effect, as the fre- 
quency data are composed of repeated 
measures over the same Ss. How- 
ever, the training effect is a “between” 
effect because it is a comparison of the 
differences between the means of the 
three treatment groups. Thus it was 
necessary to compute two analyses 
because the error term appropriate to 
test the “within” effect is different 
from the error term to test the “be- 
tween” effect. To test the Frequency 
effect, an analysis of variance was com- 
puted using the data, in milliseconds, 
from the three experimental groups 
for all the frequencies except the 
zero frequencies. The results of this 
test showed the Frequency effect to be 
significant at the .05 level (F = 3.14, 
df = 3, 126), but the interaction of 
Frequency by Training procedure 
failed to be significant at the .05 level 
(F = 1.27, df = 6, 126). One might 
think the interaction would be sig- 
nificant because the “pronounce” 
curve and “verbal” curve cross in 
Fig. 1, but it must be remembered 
that the zero frequency data were not 
included in the analysis of variance as 
they are included in Fig. 1. 

FIG. 1. The effect of differential training 
and frequency of presentation on tachisto- 
scopic recognition thresholds. [Plot not 
reproduced: mean thresholds in milliseconds 
against frequency, with separate curves for the 
pronounce, read, and verbal groups.] 

After it was discovered that the fre- 
quency effect was significant, it was 
decided to test the simple frequency 
effects at each group level, since one 
purpose of the experiment was to as- 
certain the relative effects of differ- 
ential training on tachistoscopic 
thresholds. Although it was realized 
that there are arguments against 
testing the simple effects after ob- 
taining a significant over-all effect, 
the simple frequency effects were 
tested by the Multiple Range Test 
as proposed by Duncan (1955). 
Table 1 presents the differences be- 
tween the means of 1 frequency 
words and the means of the 5, 10, and 
20 frequency words at each group 
level. 

None of the mean thresholds in the 
5, 10, and 20 frequency cells of the 
verbal training group, as shown in 
Table 1, differ significantly from the 1 
frequency mean thresholds of that 
same training group. In fact, the 
thresholds of the 5, 10, and 20 fre- 
quency cells are actually increased, as 
shown by the negative differences, as 
a result of further training. This 
finding would indicate that verbal 
training is not effective in lowering 
tachistoscopic thresholds of words. 
But the significant differences ob- 
tained between the means of the 1 
frequency cells and the means of the 
5, 10, and 20 frequency cells in the 
“pronounce” group and “read” group 
support the contention that pronunci- 
ation training and reading training 
are effective in lowering tachisto- 
scopic thresholds. 


TABLE 1 

DIFFERENCES BETWEEN THE MEANS OF THE 
1 FREQUENCY NONSENSE WORDS AND THE 
MEANS OF THE 5, 10, AND 20 FREQUENCY 
NONSENSE WORDS IN EACH 
EXPERIMENTAL GROUP 

Groups          Frequencies Compared 
                1-5 
Pronounce       21.0* 
Read            25.5* 
Verbal          -4.2 

[The 1-10 and 1-20 columns are not recoverable from the scan.] 

* P = .05. 

An analysis of covariance was com- 
puted to test the significance of the 
differences between the experimental 
groups because there were initial, un- 
controlled differences between the 
groups as shown on the zero frequen- 
cies in Fig. 1. The differences between 
the training procedures just barely 
missed significance at the .05 level 
(F = 3.20, df = 2, 41). This result, 
combined with the results obtained in 
Table 1, indicates that the reading 
method and the pronunciation method 
of training are effective in lowering 
tachistoscopic thresholds, whereas the 
verbal method is not effective. 
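
The degrees of freedom reported for the two analyses can be checked against the standard formulas for a groups-by-repeated-measures design and for a one-way analysis of covariance. The sketch assumes that all 45 Ss entered both analyses, that four nonzero frequencies were analyzed, and that a single covariate (the zero-frequency threshold) was used; the function names are introduced here for illustration.

```python
def mixed_design_df(n_subjects, n_groups, n_levels):
    """Degrees of freedom for a groups x repeated-measures design."""
    within_effect = n_levels - 1
    interaction = (n_groups - 1) * (n_levels - 1)
    within_error = (n_subjects - n_groups) * (n_levels - 1)
    return within_effect, interaction, within_error


def ancova_df(n_subjects, n_groups, n_covariates=1):
    """Degrees of freedom for a one-way analysis of covariance."""
    return n_groups - 1, n_subjects - n_groups - n_covariates


if __name__ == "__main__":
    print(mixed_design_df(n_subjects=45, n_groups=3, n_levels=4))
    print(ancova_df(n_subjects=45, n_groups=3))
```

Under these assumptions the formulas give 3 and 126 for the Frequency effect, 6 and 126 for the interaction, and 2 and 41 for the analysis of covariance, matching the values reported above.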
DISCUSSION 


The results of this experiment further 
corroborate the findings of experiments 
in which frequency was experimentally 
manipulated (Solomon, 1951; Solomon & 
Postman, 1952) and the findings of ex- 
periments in which the Thorndike-Lorge 
word counts (DeLucia & Stagner, 1953; 
McGinnies, Comer, & Lacey, 1952; 
Solomon & Howes, 1951) were used to 
control for frequency. In the face of 
the accumulation of evidence in favor of 
the frequency hypothesis, it seems that 
the frequency of prior occurrence is one 
of the more important variables in deter- 
mining tachistoscopic thresholds which 
has been investigated. 

The question might be raised whether 
oral stimulation per se is as effective as 
visual stimulation per se. Essentially, 
the point of this question is whether 
other stimulus parameters, such as in- 
tensity or stimulus duration, etc., are 
important in determining the relative 
efficacy of the two methods of presenta- 
tion. If these parameters are important, 
it might be possible to adjust the stimuli 
and equate the visual and oral methods 
on initial tachistoscopic thresholds. Al- 
though this was not done, the visual 
method and oral method were both used 
controlling the crucial frequency variable. 
The results show that oral practice alone 
has no effect on tachistoscopic thresholds. 
As can be ascertained from Fig. 1, the 
curve of the “verbal” group is essentially 
a straight line. Moreover, none of the 
tests of simple effects within the “verbal” 
group are significant as shown in Table 1. 
The experimental data collected are of 
vital importance to the controversy which 
arose between McGinnies (1950) and 
Howes and Solomon (1950) in the spoken- 
written frequency debate. It appears 
as if the spoken-frequency hypothesis is 
no longer tenable. 


SUMMARY 


Forty-five Ss were randomly assigned to 
three experimental groups. These groups 
differed in the treatment they received during 
training: a “pronounce” group read and pro- 
nounced nonsense words, a “read” group read 
nonsense words silently to themselves, and a 
“verbal” group pronounced nonsense words 
after hearing E pronounce the word. In a 
training list of 40 nonsense words, the core 
words were inserted at frequencies of 0, 1, 5, 
10, and 20 for a total of 36 words. In addi- 
tion, four filler words occurred once each in 
the list. After the training trials, Ss were 
given 8 to 10 min. of adaptation trials on a 
tachistoscope, then their tachistoscopic rec- 
ognition thresholds on the core words were 
determined. 

An analysis of variance showed the fre- 
quency variable to be significant at the .05 
level. Visual training (“pronounce” and 
“read”) significantly lowered thresholds, but 
oral training (“verbal”) did not reduce 
thresholds significantly. The differences be- 
tween the types of training just missed sig- 
nificance at the .05 level, as determined by an 
analysis of covariance. 

Stimulus redundancy has been
shown to be an important factor in 
communication (Chapanis, 1954; 
Newman & Gerstman, 1952), in dis- 
crimination of visual forms (Rappa- 
port, 1957), and in simple two-choice 
discrimination learning (Eninger, 
1952; Miller, 1939). It is clear from 
these studies that the type of redun- 
dancy used, and the conditions under 
which it is used, may alter consider- 
ably the ease with which such proc- 
esses take place. Increasing redun- 
dancy does not aid the performance 
studied in all cases. Redundancy is 
often introduced by encoding a mes- 
sage so that there are more stimulus 
features than the theoretical minimum 
necessary to transmit a particular
amount of information. This pro- 
vides an effective means for overcom- 
ing the disruption ordinarily brought 
on by irrelevant information or noise. 
But the saliency of irrelevant informa- 
tion also might be increased by addi- 
tional irrelevant information redun- 
dant with that already present. Such 
an increase should inhibit effective 
use of relevant cues. 

1 This research was supported in part by a
grant from the University of Utah Research
[...] Health, Public Health Service. The authors
[...] ties of the local offices in the construction and
duplication of taped programs.

The effects of relevant and irrele-
vant information upon performance in
a concept identification task have
been described. As the amount of
(relevant) information necessarily 
used to solve such a problem increased, 
performance decreased exponentially 
(Walker, 1957). Likewise, as the 
amount of irrelevant information 
increased, performance decreased, 
though not so rapidly (Archer, 
Bourne, & Brown, 1955; Bourne, 
1957; Bourne & Pendleton, 1958). 
In this type of task, S was required to 
categorize geometric patterns into 
groups and information of both types 
was quantified in terms of the number 
of two-level dimensions according to 
which these patterns could vary. An 
example of such a dimension is color, 
of which the two levels are red and 
green. In these studies, each dimen- 
sion was independent of all others, 
i.e., its two levels were uncorrelated 
with the levels of any other dimension. 
Therefore, the information each trans-
mitted was not redundant with that
provided by other dimensions. 

The present paper reports two ex- 
periments carried out to investigate 
the role of redundant stimulus infor- 
mation in concept identification. Re- 
dundant relevant information (Exp. I) 
and redundant irrelevant information 
(Exp. II) were systematically varied 
and trials and errors to problem solu- 
tion were used as the dependent vari- 
ables. In order that the information 
provided by two or more dimensions 
be redundant, their levels must be, to 
some degree, correlated. In the pres-
ent studies, redundancy was complete;
the levels of any two redundant di- 
mensions were perfectly correlated. 
Under such conditions, redundant
information could be wholly relevant
or wholly irrelevant, but not both 
relevant and irrelevant. These stud- 
ies were designed to test hypotheses 
concerning the facilitating effect of 
relevant redundancy and the inhibit- 
ing effect of irrelevant redundancy 
upon concept identification. The hy- 
potheses were generated from a theory 
of concept identification presented 
elsewhere (Bourne & Pendleton, 1958; 
Bourne & Restle, 1959; Walker, 1957), 
which basically is an extension of 
Restle’s (1955, 1957) mathematical 
model for two-choice discrimination 
learning. 


GENERAL PROCEDURE AND 
APPARATUS 


Task.—The task was essentially the same 
as that used in earlier concept identification 
studies (Archer, Bourne, & Brown, 1955; 
Bourne, 1957; Bourne & Pendleton, 1958) 
and is discussed more fully in those reports. 
The S was presented with a series of geometric 
patterns each of which was a combination of 
x relevant and y irrelevant two-level stimulus 
dimensions. To each pattern, S responded 
by pressing one of a number of available keys 
to identify the category to which the pattern 
belonged. The keys corresponded to the 
levels or combinations of levels of the dimen- 
sion(s) relevant to problem solution. A di- 
mension was relevant if it was necessarily 
used in correctly identifying all patterns; a 
dimension was irrelevant if it appeared at 
each of its two levels within the stimulus 
patterns but could not be consistently used to 
classify the patterns. The criterion of prob- 
lem solution was 16 consecutively correct 
identifications. All tasks were self-paced in 
that S was allowed as much time as needed to 
respond to any pattern. 

Apparatus.—In addition to S’s response 
board, which contained four push buttons 
and four signal lamps, there were four major 
units to the apparatus: (a) a Dunning Animatic 
16-mm. filmstrip projector used to present the 
sequence of patterns, (b) an Esterline-Angus 
operations recorder used to record the correct 
response and the button press made by S for 
each pattern, (c) a timing and control unit 
consisting of three cascading delay circuits 
used to determine delay and duration of in- 
formation feedback and over-all trial length, 
and (d) a Western Union tape transmitter
with a continuous loop of tape used to con-
trol S’s signal lights and to provide appropriate
information feedback after each response.
Thus, S’s response advanced the film strip
projector to a blank position, activated the 
timing unit, and was recorded on the Esterline- 
Angus. The timing unit immediately ad- 
vanced the tape transmitter, presenting the 
correct signal light to S, and after 2 sec. 
activated the transmitter again, thus turning 
off the light, and, after a total of 5 sec. had 
elapsed, advanced the projector to the next 
pattern. 


EXPERIMENT I


The purpose of the first study was to
investigate the effect of variation in
degree of redundant relevant informa-
tion upon the identification of con-
cepts. With relevant information
completely redundant, as in this ex-
periment, only two-category learning
can be investigated. Subsequently,
experiments may explore various de-
grees of redundancy completeness
wherein two, perhaps three, sets of
dimensions are relevant but inde-
pendent while all dimensions within
a set are redundant.


Procedure 
Subjects—The Ss were 180 students in 
elementary psychology courses who were as- 
signed randomly to one of 12 treatment com- 
binations. Each S was presented at the out- 
set with detailed instructions which described 
the nature of a concept, the operation of his 
controls, the meaning of the signal lamps, 
and the criterion of problem solution.

Design.—An incomplete 6 X 3 factorial
design was used with six levels of redundant
relevant information (1, 2, 3, 4, 5, and 6 di-
mensions) and three levels of nonredundant
irrelevant information (1, 3, and 5 dimen-
sions). The design was incomplete since not
all levels of relevant information were repre-
sented at all degrees of irrelevant information.
Allowing the first digit to represent the num-
ber of relevant dimensions and the second to
represent irrelevant, the 12 conditions ex-
plored were 1-1, 2-1, 3-1, 4-1, 5-1, 6-1, 1-3,
2-3, 3-3, 4-3, 1-5, and 2-5. There were two
reasons for the failure to complete the design:


TABLE 1

TWELVE TREATMENT COMBINATIONS WITH DIMENSIONS USED AS RELEVANT
(REDUNDANT) AND IRRELEVANT (NONREDUNDANT) INDICATED, EXP. I

Rows: Number of Irrelevant Dimensions
Columns: Number of Relevant Dimensions

[Cell entries are only partly legible in the scan. Legible fragments:
C; N,S,O; N,S,O,F,V; C,H,F,V,S; C,H,F,V,S,O; N.]

Note. Letters in cells of this matrix indicate dimensions used: color (C),
horizontal position (H), form (F), vertical position (V), size (S),
orientation (O), and number (N).


(a) it is, in practice, virtually impossible to 
vary more than seven two-level stimulus di- 
mensions in the construction of these prob- 
lems and (b) pretests indicated that possible 
stimulus dimensions, other than the seven 
used, are less available, or not equally strong. 
However, instead of a simple and complete 
4 X 2 factorial design with only four levels 
of relevance and two of irrelevance, the 
extended ranges were included. Table 1 
shows the dimensions used as relevant and 
irrelevant in each of the 12 treatment com- 
binations. Films were constructed in such a 
way that when any dimension was redundant 
with any other the levels within each were 
perfectly correlated, e.g., if color and form 
were redundant, squares were always green, 
triangles red. For any two nonredundant 
dimensions, the four possible combinations of 
their levels (e.g., red squares, red triangles, 
green squares, and green triangles) appeared 
equally often within the sequence of patterns. 
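
The construction rule described above can be sketched in code. What follows is an illustration only, not the procedure used to prepare the filmstrips: the function name `build_patterns` and the 0/1 coding of levels are assumptions made for the example. Redundant relevant dimensions copy a single shared value, so their levels are perfectly correlated, while each nonredundant irrelevant dimension is crossed independently so that every combination of levels occurs equally often. The loop over the 12 conditions also checks them against the stated ceiling of seven two-level dimensions.

```python
from itertools import product

# The 12 (relevant, irrelevant) conditions listed in the Design paragraph.
CONDITIONS = [(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1),
              (1, 3), (2, 3), (3, 3), (4, 3), (1, 5), (2, 5)]
MAX_DIMENSIONS = 7  # practical ceiling on two-level dimensions stated in the text


def build_patterns(n_relevant, n_irrelevant):
    """Return every distinct pattern as (relevant levels, irrelevant levels).

    Levels are coded 0/1 (a hypothetical coding). All redundant relevant
    dimensions copy one shared value, so they are perfectly correlated;
    irrelevant dimensions are crossed independently of one another.
    """
    patterns = []
    for shared in (0, 1):
        for irrelevant in product((0, 1), repeat=n_irrelevant):
            patterns.append(((shared,) * n_relevant, irrelevant))
    return patterns


if __name__ == "__main__":
    for r, i in CONDITIONS:
        assert r + i <= MAX_DIMENSIONS
        print(f"{r}-{i}: {len(build_patterns(r, i))} distinct patterns")
```

Under this coding, adding a redundant relevant dimension leaves the number of distinct patterns unchanged, whereas each added independent irrelevant dimension doubles it.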


Results 


Since instructions stressed accuracy, 
number of errors was the main depend- 
ent variable. Figure 1 is a graphical 
presentation of the mean number of 
errors for each condition of relevant 
and irrelevant information. Note- 
worthy is the increasing slope of 
the performance-relevant information 
function as number of irrelevant di- 
mensions is increased. The structure 


of the design necessitated several 
statistical analyses. The significance 
of the relevant information source was 
determined by analysis of variance at 
each level of irrelevant information. 
With one irrelevant dimension, num- 
ber of redundant relevant dimensions 
was not statistically significant (F = 
2.17, 5 and 84 df, P > .05), but with 
three and five irrelevant dimensions, 
this source was significant beyond the 
.01 level (F = 4.58, 3 and 56 df; F =
15.92, 1 and 28 df, respectively).
Trend analysis of the functions for
one and three irrelevant dimensions
indicated that each had a significant
linear component only (F = 9.37, 1
and 84 df, P < .01; F = 13.53, 1 and
56 df, P < .01, respectively). A
fourth analysis was performed on the
six groups with one or two relevant
dimensions and one, three, or five
irrelevant dimensions. The results
of this analysis are shown in Table 2.
The significance of the relevant by
irrelevant information interaction
verified the apparent change in slope
of the functions in Fig. 1. Essenti-
ally the same results were obtained
with trials to solution as the perform-
ance measure.


FIG. 1. Mean number of errors to solu-
tion as a joint function of number of redund-
ant relevant and nonredundant irrelevant
stimulus dimensions. Each plotted point
represents the data from 15 Ss.


TABLE 2

ANALYSIS OF VARIANCE OF ERRORS, EXP. I

Source of Variance                          F
Number of relevant dimensions (R)           17.10*
Number of irrelevant dimensions (I)         29.48*
R X I                                       5.06*
Residual                                    (11.67)

Note. Number in parentheses is error mean square.
*P < .01.
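
The degrees of freedom reported for the separate analyses can be checked against the design. The sketch below assumes one-way analyses with 15 Ss per group, as indicated by the caption of Fig. 1; the function name `anova_df` is introduced here for illustration only.

```python
SS_PER_GROUP = 15  # each plotted point in Fig. 1 is based on 15 Ss

# Number of relevant-dimension levels tested at each level of irrelevant information.
GROUPS_BY_IRRELEVANT = {1: 6, 3: 4, 5: 2}


def anova_df(n_groups, n_per_group):
    """Degrees of freedom (between, within) for a one-way analysis of variance."""
    return n_groups - 1, n_groups * (n_per_group - 1)


if __name__ == "__main__":
    total = sum(GROUPS_BY_IRRELEVANT.values()) * SS_PER_GROUP
    print("total Ss:", total)
    for irrelevant, n_groups in GROUPS_BY_IRRELEVANT.items():
        print(irrelevant, "irrelevant:", anova_df(n_groups, SS_PER_GROUP))
```

With 6, 4, and 2 groups the function returns 5 and 84, 3 and 56, and 1 and 28 df, which agree with the values given in the text, and the 12 groups together account for the 180 Ss.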


EXPERIMENT II 


The second experiment was designed 
to assess the effect of redundancy of 
irrelevant information upon concept 
identification. As in the first study, 
the levels of redundant dimensions 
were perfectly correlated. Thus, the 
addition of a redundant irrelevant di-
mension to those already included in
the stimuli did not increase the num- 
ber of different patterns but only the 
complexity of the figures to be cate- 
gorized. 


Procedure 


Subjects—The Ss were 100 students who 
were assigned randomly to 20 groups and 
tested individually. None had served earlier 
in Exp. I. General instructions, the same as 
those used in Exp. I, were given at the outset. 

Design.—A 5 X 2 X 2 factorial design was 
used with five levels of redundant irrelevant 
information (1, 2, 3, 4, and 5 dimensions), 
two levels of nonredundant relevant informa- 
tion (1 and 2 dimensions), and two different 
problems (defined in terms of the dimensions 
relevant to solution). The number of inde- 
pendent relevant dimensions fixed the number 
of categories into which the patterns could be 
sorted. Thus, with one relevant dimension,
the problem was two-choice, with two relevant
dimensions, four-choice. The two different
problems were used to offset the possibility of
one S passing on the solution to others. For
the two-choice groups, the relevant dimen-
sion was color or number; for the four-choice,
the relevant dimensions were color and form
or number and size. The filmstrips were con-
structed in the same manner as those used in
Exp. I.


FIG. 2. Mean number of errors to solu-
tion as a joint function of number of non-
redundant relevant and redundant irrelevant
stimulus dimensions. Each plotted point
represents the data from 10 Ss.


Results 


As in Exp. I, records indicated simi-
lar conclusions using either errors or
trials as the performance measure. 
Figure 2 is a plot of mean number of 
errors as a joint function of nonre- 
dundant relevant and redundant ir- 
relevant information. It appears, 
from inspection, that amount of ir- 
relevant information had a more 
marked effect on four-choice learning 
(two relevant dimensions) than on 
two-choice (one relevant). However, 
an analysis of variance, summarized 
in Table 3, provided no statistical 
support for this conclusion, since the 
relevant by irrelevant information 
interaction failed to reach an accept- 
able level of significance. Only two 
sources reached significance at the .01 
level: number of relevant and number 
of irrelevant dimensions. Only the
linear term of an orthogonal poly-
nomial analysis applied to the per-
formance-irrelevant dimensions func-
tion was statistically significant at the
.01 level (F = 13.42, 1 and 80 df).


TABLE 3

ANALYSIS OF VARIANCE OF ERRORS, EXP. II

Source of Variance
Number of relevant dimensions (R)
Number of irrelevant dimensions (I)
Problems (P)
R X I
R X P
I X P
R X I X P
Residual

Note. Number in parentheses is error mean square.
*P < .01.
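
The 80 df for error follows from the layout of Exp. II. The sketch below assumes a balanced design with equal numbers of Ss in every cell; the function name `cell_layout` is hypothetical and serves only to make the arithmetic explicit.

```python
from math import prod

FACTOR_LEVELS = {"irrelevant": 5, "relevant": 2, "problems": 2}
TOTAL_SS = 100


def cell_layout(levels, total_ss):
    """Return (number of cells, Ss per cell, residual df) for a balanced factorial."""
    n_cells = prod(levels.values())
    per_cell, remainder = divmod(total_ss, n_cells)
    if remainder:
        raise ValueError("Ss cannot be divided equally among cells")
    return n_cells, per_cell, total_ss - n_cells


if __name__ == "__main__":
    print(cell_layout(FACTOR_LEVELS, TOTAL_SS))
```

The 20 cells hold 5 Ss each, leaving 100 - 20 = 80 df for the residual. Pooling the two problems gives the 10 Ss per plotted point stated in the caption of Fig. 2.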


DISCUSSION 


Some of the effects of stimulus redun- 
dancy observed in these experiments were 
to be expected on the basis of a theory of 
concept identification (Bourne & Pendle- 
ton, 1958; Bourne & Restle, 1959; 
Walker, 1957) and on the basis of earlier 
experiments (Chapanis, 1954; Eninger,
1952; Miller, 1939; Newman & Gerstman,
1952; Rappaport, 1957). The theory
assumes that rate of learning, and, there-
fore, total number of errors, depends upon
the amount of relevant and irrelevant in-
formation present in the stimuli to be iden-
tified. In fact, a learning rate param-
eter, θ, is assumed equal to the ratio of
relevant to total (relevant plus irrelevant)
cues available to S. Increasing the
number of redundant relevant dimen-
sions theoretically and empirically in-
creases the number of relevant cues and
reduces the mean number of errors made
in learning to identify a set of stimuli.
On the other hand, increasing the num-
ber of redundant irrelevant dimensions
reduces θ, increases the saliency of ir-
relevant cues, and inhibits performance
to a significant degree.
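
The direction of these two effects on the learning rate parameter can be illustrated numerically. The sketch below assumes, purely for illustration, that each dimension contributes one equally weighted cue; the theory itself does not require equal weights, and the function name `learning_rate` is not taken from the cited papers.

```python
def learning_rate(relevant_cues, irrelevant_cues):
    """Theta: ratio of relevant cues to total (relevant plus irrelevant) cues."""
    total = relevant_cues + irrelevant_cues
    if total <= 0:
        raise ValueError("at least one cue is required")
    return relevant_cues / total


if __name__ == "__main__":
    # Hypothetical simplification: one equally weighted cue per dimension.
    for relevant in range(1, 7):
        print(f"{relevant} relevant, 1 irrelevant: theta = {learning_rate(relevant, 1):.2f}")
    for irrelevant in range(1, 6):
        print(f"1 relevant, {irrelevant} irrelevant: theta = {learning_rate(1, irrelevant):.2f}")
```

Under this simplification the ratio rises as redundant relevant dimensions are added and falls as irrelevant dimensions are added, which is the ordinal pattern the paragraph describes.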

Rappaport (1957), in a somewhat dif- 
ferent task, showed that relevant stimu- 
lus redundancy facilitated rapid dis- 
crimination of visual forms in the pres- 
ence of background noise (irrelevant 
information) but had a slightly inhibiting 
effect in a noise-free situation. This 
result supports the significance of the 
relevant by irrelevant information inter- 
action observed in Exp. I. Relevant 
redundancy became increasingly more 
effective as additional irrelevant informa- 
tion was introduced into the patterns. 
The design of Exp. I did not include
conditions of no irrelevant information.
It is entirely possible that, had such con-
ditions been explored, redundancy would
interfere with performance. As Rappa-
port suggested, the interference of re-
dundancy in a noise-free situation may be
attributable to S’s attempt to discrimin-
ate and use all (redundant) details of the
pattern for his identification.

Bricker (1955) reported evidence
which appears to contradict the present
findings. In his experiment, S’s task
was to classify sets of binary light pat-
terns into 2, 4, or 8 groups. The failure
of additional and redundant relevant
information to facilitate performance
probably can be attributed to the fact
that, even in the 2-choice situation, no
single binary system could be used for
correct classification; in all cases, S had
to classify on the basis of the illumination
of three of six lights, i.e., the arrangement
of three binary systems. In the present
experiment, by comparison, S could
classify patterns on the basis of any one
of the number of dimensions (binary
systems) which were relevant and
redundant.

A thorough search of the literature re- 
vealed no studies related to the effects of 
redundant irrelevant information. It is 
clear, however, from a comparison of the 
present data to those of earlier concept 
identification studies that redundant ir- 
relevant information interfered less with 
performance than did a comparable 
degree of nonredundant information. 
Table 4 presents the mean errors made
in Exp. II in the 2- and 4-choice prob-
lems at 1, 3, and 5 irrelevant dimensions.
For comparison, the mean errors in cases
where the same amount of irrelevant
information was nonredundant are tabu-
lated. Note that increasing either type
of irrelevant information yields essenti-
ally linear decrements in performance.
There is an absolute difference in the
number of errors which probably can be
attributed to the following effects.
When independent irrelevant dimensions
are added to stimuli, they progressively
increase the number of possible hypothe-
ses for solution (Archer et al., 1955),
whereas the addition of irrelevant di-
mensions perfectly correlated with those
already present only increases the sali-
ency of the hypotheses available. The
comparisons of Table 4 support the con-
clusion that number, rather than con-
spicuousness of irrelevant hypotheses, is
the more important factor in concept
identification.


TABLE 4

MEAN NUMBER OF ERRORS AS A FUNCTION OF
NUMBER OF REDUNDANT AND NONREDUN-
DANT IRRELEVANT DIMENSIONS IN 2-
AND 4-CHOICE CONCEPT
IDENTIFICATION

Concept Task              Number of Irrelevant Dimensions
4-choice    Redundant
            Nonred. (a)
2-choice    Redundant
            Nonred. (b)

(a) Data from Bourne and Pendleton (1958)
(b) Data from Exp

SUMMARY 


Two experiments were conducted to test
the effects of stimulus redundancy upon the
identification of concepts. The task in both
experiments was to learn the correct method
of classifying visually presented geometric
figures. In Exp. I, 180 Ss served in an in-
complete factorial design with six levels of
redundant relevant information (1, 2, 3, 4, 5,
and 6 stimulus dimensions) and three degrees
of nonredundant irrelevant information (1,
3, and 5 stimulus dimensions).

In Exp. II, 100 Ss served in a 5 X 2 X 2
factorial with five degrees of redundant ir-
relevant information (1, 2, 3, 4, and 5 stimu-
lus dimensions), two degrees of nonredundant
relevant information (1 and 2 dimensions),
and two different problems.

The major findings were: (a) Increases in
redundant relevant information improved
performance at all levels of irrelevant informa-
tion. (b) The facilitating effect of redundant
relevant information became more apparent
as amount of irrelevant information increased.
(c) Redundant irrelevant information inter-
fered with performance at both levels of
relevant information; however, it had a less
inhibiting effect than comparable degrees of
nonredundant irrelevant information.

The results were compatible with the
effects of redundancy observed in other tasks
and were interpreted within the theoretical
framework of Restle (1955, 1957).


AN ADAPTATION-LEVEL ANALYSIS OF ORDINAL
EFFECTS IN JUDGMENT 


ALLEN PARDUCCI 1


University of California, Los Angeles 


All judgments are comparative in 
the sense that they are determined by 
the relationships between stimulus 
events. Sometimes the relevant
stimuli are specified in the instructions 
for judgment, as when the task is to 
judge which of two lines is longer. 
But even with such specification, 
background, context, or other un- 
judged stimuli may affect the judg- 
ment (e.g., Brunswik & Herma, 1951; 
Fernberger, 1920; Hollingworth, 1910; 
Kohler & Wallach, 1944; Wallach, 
1948). 

When the task is not to compare two 
stimuli but rather to judge each one in 
absolute terms (e.g., “‘short”’ or ‘‘long”’ 
rather than “shorter” or “longer’’), 
the influence of background stimuli 
becomes much more prominent. 
Thus the study of absolute judgments 
has been largely directed toward the 
description of the class of stimuli 
which determines such judgments 
(Helson, 1947; Johnson, 1955). 
Special attention is paid to the very 
context effects which one usually tries 
to minimize or balance out in the 
traditional study of comparative judg- 
ments. 

1 The author wishes to express his appreci-
ation to Harry Vickers, Lee Kabasakalian,
Pat Brewer, and Barbara Timoner for assist-
ance in the collection and analysis of the data
for these experiments.

The order in which stimuli are
presented is one of the features of the
stimulus context which influences both
absolute (Parducci: 1954, 1956; Ver-
planck & Cotton, 1955) and compara-
tive judgments (Fernberger, 1920;
Stevens, 1957). With single-stimulus
and also with constant-stimulus meth- 
ods, the variable stimuli are presented 
in randomized or counterbalanced 
order to minimize such effects; and 
with the method of limits, both as- 
cending and descending series are 
used, the computed threshold being 
based on an average of results ob- 
tained from both procedures (Wood- 
worth & Schlosberg, 1954). How- 
ever, the deliberate establishment of 
experimental conditions which mag- 
nify the ordinal effects may itself be 
attempted in order to learn more 
about them. The conditions which 
determine their magnitude should be 
deducible from more general princi- 
ples of judgment; and, insofar as 
ordinal effects cannot be explained by 
current judgment theory, research 
defining the conditions which in- 
fluence them should lead to improved 
theories of judgment. 

The present study investigates 
ordinal effects as they relate to Hel- 
son’s theory of adaptation level, the 
theory which attempts most specifi-
cally to deal with the effects of stimulus
context upon judgment (Helson, 1947;
Michels & Helson: 1949, 1954). Ac- 
cording to Helson’s theory, the per- 
ceptual judgment of any stimulus 
depends upon the ratio of the physical 
value of that stimulus and the physi- 
cal value of S’s current adaptation 
level (AL). To the degree that the 
judged stimulus is greater than the 
momentary value of AL, the stimulus 
is judged “large” (or “larger’’ than 
the standard to which it is being com- 
pared). The theory defines AL as 
the physical value of the stimulus 
which would be judged “neutral”
(or “equal” to the standard) and
postulates that this value is a weighted
geometric mean of the various stimuli
to which S has been exposed. While
no general attempt has thus far been
made to deduce ordinal effects from the
theory,2 certain clear predictions con-
cerning the effects of different orders
of exposure can be made since the 
order of exposure determines which 
stimuli will have been exposed at any 
ordinal position in the series. 
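
The definition of AL as a weighted geometric mean, together with the ratio rule for judgment, can be written out directly. The sketch below is illustrative only: the function names are introduced here, and equal weights are assumed when none are supplied, whereas the theory leaves the weights to be determined empirically.

```python
import math


def adaptation_level(stimuli, weights=None):
    """Weighted geometric mean of the stimulus values S has been exposed to."""
    if weights is None:
        weights = [1.0] * len(stimuli)  # assumption: equal weighting
    total_weight = sum(weights)
    log_mean = sum(w * math.log(s) for s, w in zip(stimuli, weights)) / total_weight
    return math.exp(log_mean)


def judged_large(stimulus, al):
    """A stimulus whose ratio to AL exceeds 1 falls on the 'large' side of neutral."""
    return stimulus / al > 1.0


if __name__ == "__main__":
    al = adaptation_level([50, 100, 200])  # hypothetical lengths in mm
    print(round(al, 1), judged_large(150, al), judged_large(80, al))
```

Because the mean is taken over logarithms, giving more weight to the stimuli seen first or most often pulls AL toward their values, which is the mechanism behind the ordinal predictions.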

2 Adaptation-level theory has, however,
been applied to one special kind of ordinal
phenomenon, time-order effects with com-
parative judgments (Michels & Helson, 1954).

The present research was designed
to study such ordinal effects where
they would presumably be most pro-
nounced, i.e., with absolute judg-
ments. However, the usual method
of single stimuli was modified to per-
mit greater specification of the class of
relevant stimuli than is usually possi-
ble with absolute judgments. This
procedure also permitted independent
manipulation of the orders in which
the stimuli were first presented and
later judged.

The theory of adaptation level
yields the prediction that if a series of
stimuli were presented for judgment
in order of increasing magnitude, the
respective stimuli in the series would
tend to elicit higher categories of
judgment than if the same series were
presented in order of decreasing
magnitude (i.e., the AL is pulled
toward the beginning of an ordered
stimulus series). This prediction fol-
lows from the fact that for any
stimulus value in the ascending series,
the mean of all the stimuli which pre-
ceded it is lower than the mean of the
stimuli which would have preceded it
if the series had been presented in
descending order. In common sense
terms, S compares each stimulus with
the others he has seen. Since each
successive stimulus is larger than the
others he has seen when the order is
ascending, he judges it with a “larger”
category than he would use if it were
smaller than the others (i.e., in a de-
scending series).
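
The argument from preceding stimuli can be illustrated with a small computation. The stick lengths below are hypothetical, and the unweighted geometric mean is used as a stand-in for AL; the function name `preceding_geometric_means` is introduced for the example.

```python
import math


def preceding_geometric_means(series):
    """Map each stimulus after the first to the geometric mean of all earlier ones.

    Assumes the values in `series` are distinct, since they are used as keys.
    """
    means = {}
    log_sum = 0.0
    for count, value in enumerate(series):
        if count > 0:
            means[value] = math.exp(log_sum / count)
        log_sum += math.log(value)
    return means


if __name__ == "__main__":
    lengths = [40, 80, 120, 160, 200]  # hypothetical stick lengths in mm
    ascending = preceding_geometric_means(lengths)
    descending = preceding_geometric_means(lengths[::-1])
    for value in lengths[1:-1]:
        print(value, round(ascending[value], 1), round(descending[value], 1))
```

For every interior stimulus the mean of its predecessors lies below it in the ascending order and above it in the descending order, so the same stick stands higher relative to AL when the series ascends.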

But would this ordinal effect still 
occur if S were shown all the stimuli 


before he commenced his judgments? 
According to adaptation-level theory, 
his judgments would now be independ- 
ent of the order in which he first in- 
spected the stimuli providing that he 
was equally exposed to each of them. 
However, the theory still predicts 
that his AL would be pulled toward the 
values of the stimuli he judges first 
since he would have had more ex-
posure to these than to the stimuli he
has not yet judged but only inspected.


EXPERIMENT I
Method 


Stimuli.—Two sets of wooden sticks, vary-
ing in length, were the stimuli for Exp. I and
also for the subsequent variations. The 43
sticks in each set were selected so as to present
skewed distributions of lengths. The re-
spective sticks were 10 mm. in width, 5 mm.
thick, and their lengths varied from 32 to
298 mm. in both sets. In the positively-
skewed distribution, the increments in length
were each approximately 4.5 mm. for the 35
shortest sticks and 14.5 mm. for the remaining
8 sticks, yielding a set with a geometric mean
of 115 mm. For the negatively-skewed dis-
tribution, the same two end-sticks were used;
but the magnitudes of the successive steps
were reversed so that the increases were
approximately 14.5 mm. for the 8 shortest
sticks and 4.5 mm. for the remaining 35
sticks, giving a geometric mean of 178 mm.
Under all experimental conditions, in both
this and in the following experiments, the
sticks were placed horizontally before S on a
22 X 28-in. white posterboard ruled with
horizontal lines 10 mm. apart. A single
vertical line bisected the posterboard, and
the sticks were always placed so that their
left ends coincided with this line at the be-
ginning of the judgment series.

Subjects.—The Ss were 160 college students
in introductory psychology. Each S was run
individually on the basis of random assign-
ment to one of the eight experimental condi-
tions, with 20 Ss under each condition.
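
The two stick distributions described under Stimuli can be approximated in a few lines. This is a rough reconstruction under stated assumptions, not the actual set of sticks: 43 sticks require 42 steps, which are split here into 34 steps of 4.5 mm. and 8 steps of 14.5 mm., and the nominal step sizes are treated as exact. With these values the longest stick comes out at 301 mm. rather than 298 mm., consistent with the increments having been only approximate.

```python
import math


def stick_lengths(first, steps):
    """Cumulative lengths starting at `first` and adding each step in turn."""
    lengths = [first]
    for step in steps:
        lengths.append(lengths[-1] + step)
    return lengths


def geometric_mean(values):
    """Unweighted geometric mean of positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: 34 small and 8 large steps, using the nominal step sizes in mm.
SMALL, LARGE = 4.5, 14.5
positive_skew = stick_lengths(32, [SMALL] * 34 + [LARGE] * 8)
negative_skew = stick_lengths(32, [LARGE] * 8 + [SMALL] * 34)

if __name__ == "__main__":
    print(len(positive_skew), positive_skew[-1], round(geometric_mean(positive_skew), 1))
    print(len(negative_skew), negative_skew[-1], round(geometric_mean(negative_skew), 1))
```

Whatever the exact step sizes, placing the small steps first keeps every stick in the positively-skewed set no longer than its counterpart in the negatively-skewed set, so its geometric mean is necessarily the lower of the two.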

Procedure.—Each S was seated before a 
table which was covered by the posterboard. 
Instructions were read by E who sat on the 
opposite side of the table. These stated that 
S would be shown a series of sticks and that 
after all of them had been laid out before him, 
he would judge each one in terms of five 
alternative categories: “very long,” “long,”
“medium,” “short,” or “very short,” in
accordance with how long or short each stick 
appeared in comparison with all the other 
sticks. The respective sticks were then laid 
out to his right, one at a time, at a rate of 
approximately 3 sec. per stick. The position 
of each stick was fixed, regardless of the ex- 
perimental condition, with the longest in the 
position furthest from S and the other 42 
approaching his side of the table in order of 
decreasing length. When all 43 sticks were in 
place, S was instructed to move them one at a 
time to the corresponding position on the 
other side of the posterboard, announcing his 
judgment of each stick as he moved it. 

Two of the independent variables involved 
orders of manipulation. The order of pres- 
entation was determined by whether E pre- 
sented the sticks in the short-to-long (SL) or 
long-to-short (LS) orders. The order of 
judgment was determined by whether S had 
been instructed to move the sticks (announc- 
ing his judgments) in the short-to-long (SL) 
or long-to-short (LS) orders. These two
variables were manipulated for both the
positively- and negatively-skewed distribu-
tions of sticks (the third independent vari-
able) in a 2 X 2 X 2 factorial design. The
deductions from adaptation-level theory, as
outlined above, are that ALs should be
higher for the LS than for the SL order of
judgment but that there should be no differ-
ences associated with the order of presenta-
tion. Apart from these ordinal effects, the
theory predicts lower ALs for the positively-
than for the negatively-skewed distribution
since the geometric mean of the stimulus
lengths is lower in the positively-skewed
distribution.

Results and Discussion 


An AL was computed for each S by
calculating the mean of the two limens
for his “medium” category after
these limens had been determined by
bisecting the interval between the
longest stick in one category and the
shortest in the adjoining category.
The group means and SDs are pre-
sented in Table 1. All three inde-
pendent variables have highly sig-
nificant effects upon AL (see Table
2), together accounting for 84% of
the total variance, an extraordinarily
high percentage for this type of re-
search.


TABLE 1

MEAN ADAPTATION LEVELS: EXP. I

Order of            Presentation Order
Judgment            SL                  LS
                    Mean      SD        Mean      SD

Positively-Skewed Distribution
SL                  113.
LS                  136.

Negatively-Skewed Distribution
SL                  174.9
LS                  205.2

[Remaining cells are not legible in the scan; the values 17.7 and 16.0
also appear, but their position in the table cannot be recovered.]
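
The computation of AL from the limens of the “medium” category can be sketched as follows. The function names and the sample judgments are hypothetical, and the sketch assumes that S used the “short,” “medium,” and “long” categories at least once each.

```python
def limen(lower_category_lengths, upper_category_lengths):
    """Bisect the interval between the longest stick in the lower category
    and the shortest stick in the adjoining higher category."""
    return (max(lower_category_lengths) + min(upper_category_lengths)) / 2


def al_from_limens(judgments):
    """AL as the mean of the two limens bounding the 'medium' category.

    `judgments` maps each category name to the stick lengths (mm) placed in it.
    """
    lower = limen(judgments["short"], judgments["medium"])
    upper = limen(judgments["medium"], judgments["long"])
    return (lower + upper) / 2


if __name__ == "__main__":
    # Hypothetical judgments for one S.
    sample = {"short": [60, 70], "medium": [80, 90, 100], "long": [110, 120]}
    print(al_from_limens(sample))
```

In this made-up case the lower limen is 75 mm. and the upper limen is 105 mm., so AL is 90 mm., the midpoint of the “medium” category.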

Inspection of the mean ALs in
Table 1 reveals that these central
measures of Ss’ judgment scales are
shifted toward the values of the
stimuli which are judged first. This
was the prediction from adaptation-
level theory. However, the theory
did not predict that the same kind of
shift would be associated with the
order of presentation. It should be
noted that these ordinal effects, pre-
dicted and not predicted, occur both
when the general AL is below and
when it is above 165 mm., the mid-
point of the range for both distribu-
tions. A tabulation was also made of
the ordinal effects with respect to each
of the other four category midpoints.
This indicated that the ordinal effects
shown in Table 1 are manifest
throughout the judgment scale, the
midpoints of each of the categories
shifting toward the values which are
either presented or judged first. The
sum of each S’s five category mid-
points was used as the raw score for a
second analysis of variance. Since
the results of this analysis were ex-
tremely similar to those reported in
Table 2 (with no shifts in the re-
spective probability levels), only the
AL analysis is presented.


Could the results of this experiment have 
been an artifact of the measure of AL? With 
skewed distributions, the midpoint of the 
“medium” category could be systematically 
affected by changes in the width of this cate- 
gory. To insure that this was not the case, 
an examination was made of the actual rela- 
tionship between the category widths and the 
ALs. For both distributions, the relation- 
ship was such as to actually minimize the 
obtained effects of order; i.e., increase in 
width was paralleled by increase in AL for 
the positively-skewed distribution and by 
decrease in AL for the negatively-skewed dis- 
tribution. The data for the groups in which 
there was some tendency for the “medium”
category to be applied to a skewed portion of
the stimulus distribution were reanalyzed,
using the median rather than the midrange of
the “medium” stimuli as the measure of AL.
The ordinal effects were just as marked, in- 
dicating that they were not an artifact of the 
particular measure selected. 

TABLE 2

ANALYSIS OF VARIANCE FOR ALs: Exp. I

Source | df | MS | F
Distribution skewness (D) | 1 | 158,401.1 | 715.8*
Order of judgment (J) | | 24,942.5 | 112.7*
Order of presentation | | | 18.5*
Within

* P < .001.


As a check on whether these effects occur
only for skewed distributions, two additional
groups of 20 Ss each were exposed to a new,
rectangular distribution of sticks (i.e., with
the 43 sticks progressing in equal steps from
32 mm. to 298 mm.). The stimuli were presented
to both groups in the SL order; one
judged them in the SL order, the other in the
LS order. The ALs, calculated as
before, were 143 mm. and 175.8 mm.,
respectively. The difference associated with
the order of judgment is 32.8 mm. (with SDs
of 13.2 and 16.8, t = 6.68, P < .001),
greater than the corresponding differences for
either of the skewed distributions. Thus
ordinal effects are obtained even when the
stimuli differ by equal amounts throughout
the range.
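
The t reported for the rectangular distribution can be approximated from the published summary values alone. The sketch below is an illustration under stated assumptions, not the author's procedure: it assumes an ordinary independent-groups t with 20 Ss per group, and, because the text does not say whether the SDs were computed with N or N - 1 in the denominator, it evaluates both conventions.

```python
import math


def t_equal_n(mean_diff, sd1, sd2, n, sd_divisor_is_n):
    """Independent-groups t for two groups of equal size n (df = 2n - 2)."""
    if sd_divisor_is_n:
        # SDs computed as sqrt(SS / n): rescale to unbiased variances.
        var1 = sd1 ** 2 * n / (n - 1)
        var2 = sd2 ** 2 * n / (n - 1)
    else:
        # SDs already computed as sqrt(SS / (n - 1)).
        var1, var2 = sd1 ** 2, sd2 ** 2
    # With equal n, the pooled standard error reduces to this form.
    standard_error = math.sqrt((var1 + var2) / n)
    return mean_diff / standard_error


mean_diff = 175.8 - 143.0  # 32.8 mm., as reported
t_if_divisor_n = t_equal_n(mean_diff, 13.2, 16.8, 20, True)
t_if_divisor_n_minus_1 = t_equal_n(mean_diff, 13.2, 16.8, 20, False)
```

The two conventions give roughly 6.69 and 6.87. The first is close to the reported 6.68, which may suggest that the SDs were computed with N in the denominator, although rounding of the published values could also account for the small gap.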


The unexpected, but highly sig- 
nificant, effects of the order of pres- 
entation suggest that this variable 
may not have been entirely isolated 
from the order of judgment; i.e., Ss 
may have been assigning the judg- 
ment categories to the stimuli during 
the presentation period even though 
they were not permitted to announce 
their judgments during this period. 
Insofar as these covert judgments 
become attached to their respective 
stimuli, the order of presentation 
would affect the subsequent overt 
judgments in the obtained direction. 
An attempt was made to test this 
interpretation in Exp. II by varying 
the order of presentation without 
informing Ss that they would sub- 
sequently be asked to judge the 
stimuli.


EXPERIMENT II
Method 


The two orders of presentation were util- 
ized again. Two new groups, of 25 Ss each, 
were exposed to the positively-skewed dis- 
tribution of sticks and instructed to make their 
judgments in the LS order. To reduce the 
likelihood of covert judgments during the 
exposure period, an incidental exposure tech- 
nique was used. The preliminary instructions 
stated only that this was an experiment in 
perception and that S should pay close 
attention to what he would be shown. The 
instructions for judgment were not given 
until all 43 sticks had been placed on the 
table. 


Interpolated frequency estimates.—In order 
to study how S’s knowledge of the distribution
of stimulus values is affected by the order
of presentation, two additional groups, of 38 
Ss each, were run under the incidental-ex- 
posure conditions just described; however, 
before being given the instructions for judg- 
ment, Ss in these two new groups were asked 
to estimate the frequencies with which lengths 
from different parts of the range had been 
presented. After the sticks were all in place 
(having been presented in the SL order to one 
group and in the LS order to the other), the 
sticks were covered. Six new sticks, 30, 84, 
138, 192, 246, and 300 mm. in length, re- 
spectively, were then placed at right angles 
to the previous position of presentation. The 
instructions stated that these six sticks 
bracketed the range of stimuli previously 
presented and that the task was to estimate 
how many of the original 43 sticks fell be- 
tween each of the successive pairs of the six 
test sticks. Pencil and paper were provided 
and each S was instructed to start his esti- 
mates with the longest pair and work through 
to the shortest (although changes in the esti- 
mates were permitted at any time in the 
process). After these frequency estimates 
had been made and E had ascertained that 
the five estimates added up to 43, the original 
series was uncovered. The usual instructions 
for judgment were then read, and S proceeded 
with his judgments in the LS order. 

Order of removal.—Since each stick in the 
previous variations was left before S from the 
time of initial presentation until the end of the 
experiment, the sticks presented first were also 
exposed longest. Insofar as the effect of each 
stimulus upon AL increases with increased 
duration of exposure (AL shifting toward the 
values presented longest just as it shifts 
toward those presented most frequently, cf. 
Parducci & Brookshire, 1956), the observed 
direction of the effects of the order of pres- 
entation may be deduced from adaptation- 
level theory without recourse to assumptions 
about covert judgments. But this interpre- 
tation of the theory also yields the prediction 
that if all of the stimuli were first exposed 
simultaneously and were then removed one 
at a time, the AL would shift toward the 
stimuli which were removed last (and thus 
exposed longest). To test this interpretation, 
two new groups, of 20 Ss each, were first read 
the incidental instructions, after which E lifted 
a cover, exposing all 43 sticks at once. He 
then removed the sticks, successively, in 
either the SL or LS order, keeping the total 
exposure period approximately equal to that 
of each of the previous conditions. Finally, 
the entire set was replaced before S (by lifting
onto the table a hidden, posterboard background
on which the sticks had been stored
in their usual position), and the regular
judgment instructions were read. Again,
both groups made their judgments in the LS
order.


TABLE 3

MEAN ADAPTATION LEVELS FOLLOWING
INCIDENTAL EXPOSURE WITH LS ORDER
OF JUDGMENT: Exp. II

Order of Presentation
Condition | SL | LS

Mean Mean SD

No freq. estimates
After freq. estimates
Order of removal

148.9 | 11.0


Results and Discussion 


Comparison of the new ALs, pre- 
sented in Table 3, with the ALs for 
the corresponding groups from Exp. I 
indicates that strong ordinal effects 
are obtained under incidental condi- 
tions but that they are not found in 
association with the order of stimulus 
removal. An analysis of variance, 
based on the ALs for the four inci- 
dental groups which did not make fre- 
quency estimates, indicated a signifi- 
cant interaction between the effects of 
the order of exposure durations and 
the effects of whether the exposure 
involved addition or removal (df = 
1 and 76, F = 7.57, P < .01). This 
indicates that the effects of the order 
of presentation are not based entirely 
on differences in duration of exposure. 

Interpretation of the obtained ordi- 
nal effects hinges on the unmeasured 
effects of the “incidental” instructions
upon the tendency to make covert 
judgments. Although complete ab- 
sence of ordinal effects would have 
fitted nicely with an interpretation 
based upon covert judgment, the 
obtained results do not obviate such
an interpretation since Ss may still 
have made unvocalized judgments of 
length during the presentation period. 
It is unlikely that many Ss would 
have made these covert judgments in 
terms of five-category scales, but any 
scale of length an S used would pre- 
sumably have shown some transfer 
effects upon the one elicited by the 
subsequent instructions for judgment. 
The complete absence of ordinal
effects for the removal groups is consistent
with this interpretation; for
if Ss tend to make their judgments
upon first exposure to the stimuli,
both removal groups would have
made the same covert judgments before
E began to remove the sticks.
Analogy with other experiments in
which the stimulus range was reduced
during the presentation series (Parducci,
1956) suggests that there should
be little shift in the judgment scale
during a removal series.

TABLE 4

MEAN ESTIMATES OF STIMULUS
FREQUENCIES: Exp. II

Successive Test Intervals

| 30-84 | 84-138 | 138-192 | 192-246 | 246-300

No.* 13.0 10.0 4.0
SL 9. 9.9 9.4 8.2
i t7 : 10.9 9.9 6.9

* Actual number of stimuli within each interval.


Interpolated frequency estimates.
The mean estimates for each of the
successive pairs of the six test sticks
are presented in Table 4. While
both groups overestimated the number
of longer sticks, the magnitude of
this error was greater for the SL
group (df = 74, t = 2.24, P < .05;
each score being the difference between
the sum of S’s estimates for the two
“shortest” intervals of lengths minus
the corresponding sum for the two
“longest” ones). Thus it might appear
that the two groups had acquired
different information about the character
of the stimulus distribution even
before the overt judgments of length
were made, the LS group believing it
to be more positively skewed. But
the whole weight of adaptation-level
theory (supported by Exp. I and also
by the results of previous research,
cf. Helson, 1947; Johnson, 1955;
Parducci, 1956) argues that the more
positively skewed the distribution, the
lower the AL. The fact that the LS
group’s AL was significantly higher
thus seems to indicate, instead, that
such appraisals of the stimulus distribution
may be only incidental to
the judgment process. Thus, if more
of the stimuli were covertly judged
“long” during the SL than during the
LS presentation, Ss might have been
expected to guess that more of the
stimuli had been within the “longer”
test intervals following the SL than
following the LS presentation series.
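
The score used in the comparison of the two frequency-estimate groups is a simple difference of sums. The sketch below illustrates it with made-up estimates; the function name and the example values are hypothetical, and the check that the estimates total 43 mirrors the requirement E enforced before the judgments began.

```python
def skewness_score(estimates):
    """estimates: one S's five frequency estimates, ordered from the
    shortest test interval (30-84 mm.) to the longest (246-300 mm.)."""
    if len(estimates) != 5:
        raise ValueError("expected one estimate per test interval")
    if sum(estimates) != 43:
        raise ValueError("estimates must total the 43 sticks presented")
    two_shortest = estimates[0] + estimates[1]
    two_longest = estimates[3] + estimates[4]
    # Larger scores mean S believed the series held relatively more
    # short sticks, i.e., was more positively skewed.
    return two_shortest - two_longest


hypothetical_estimates = [12, 10, 9, 7, 5]  # made up, not from Table 4
score = skewness_score(hypothetical_estimates)
```

For these made-up estimates the score is (12 + 10) - (7 + 5).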


EXPERIMENT III 
Method 


The variations introduced in Exp. III were
designed to test an alternative or supplement
to the adaptation-level interpretation of the
effects of the order of judgment. This alternative
assumes that S has a tendency to
switch judgment categories whenever the
stimulus to be judged is discriminably different
from the stimulus judged just before it.
But since S was permitted only five judgment
categories with which to judge the 43 clearly
discriminable stimuli in the preceding experiments,
his category limens and AL were
pulled toward the initial stimuli. In Exp.
III, the following two methods were employed
to minimize the effects of this hypothetical
switching tendency: (a) the use of more judgment
categories than in the previous experiments,
and (b) a reduction in the number of
stimuli, without changing either the number
of categories or any other relevant property
of the stimulus distribution.

Four new groups were exposed to the
negatively-skewed distribution of Exp. I,
using the SL order of presentation. However,
two of these groups were presented with only
a subsample, composed of every third stick
from the old distribution. This reduced the
total number of stimuli from 43 to 15; but it
did not change the range, and it had almost
no effect on either the arithmetic or geometric
means of the distribution. The other two
groups were exposed to all 43 sticks under
instructions to use 10 rather than 5 judgment
categories. These categories were designated
by the numerals “1” through “10,”
rather than by verbal labels, because of the
difficulty of getting Ss to remember so many
labels.


TABLE 5

MEAN ADAPTATION LEVELS FOR DIFFERENT
NUMBERS OF STIMULI AND JUDGMENT
CATEGORIES WITH SL ORDER OF
PRESENTATION: Exp. III

Order of Judgment
Condition | SL Mean | LS Mean

Exp. I | 174.9 | 205.2
15 sticks | 191.2 | 204.5
10 categories | 199.5 | 199.2


For each of the new conditions, 15-stick
or 10-category, one new group of 20 Ss
made their judgments in the SL order while a
second group made them in the LS order.
In all other respects, the procedure was identical
to that used in Exp. I.


Since either increase in the number of
categories or decrease in the number of stimuli
reduces the average number of stimuli which
must be included in each category, the ordinal
effects should be reduced insofar as they depend
on S’s tendency to switch categories
when a stimulus is discriminably different
from the stimulus which preceded it. However,
this prediction does not follow from
adaptation-level theory since AL is supposed
to be independent of the number of categories
(Michels & Helson, 1949) and since reducing
the number of stimuli did not affect the geometric
mean of the stimulus distribution.
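
The arithmetic behind this prediction is easy to verify. The sketch below assumes, hypothetically, that the 15-stick subsample was drawn by taking every third stick starting with the shortest, which is the only starting point that keeps both ends of the range; the condition labels are illustrative.

```python
N_STICKS = 43

# Rank positions of the sticks, shortest (1) to longest (43).
positions = list(range(1, N_STICKS + 1))
subsample = positions[::3]  # every third stick: 1, 4, ..., 43

# (number of stimuli, number of judgment categories) per condition.
conditions = {
    "Exp. I": (N_STICKS, 5),
    "15 sticks": (len(subsample), 5),
    "10 categories": (N_STICKS, 10),
}

# Average number of stimuli that must share a single category.
per_category = {name: n / k for name, (n, k) in conditions.items()}
```

Under these assumptions the subsample keeps both the shortest and the longest stick, and the average load per category drops from 8.6 stimuli in Exp. I to 3.0 in the 15-stick condition and 4.3 in the 10-category condition.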


Results 


The new ALs are presented in
Table 5 (for the 10-category groups,
AL was defined as the limen between
categories “5” and “6”). The differences
associated with the order of
judgment are significantly smaller for
both of the new conditions than for
the two corresponding groups in
Exp. I, i.e., for the two groups presented
the negatively-skewed distribution
in the SL order. Separate
analyses of variance, comparing each
pair of the new groups with the reference
pair from Exp. I, indicated that
the interaction was in each case statistically
significant (F = 16.59, P <
.001 for the increase in categories; F
= 5.34, P < .05 for the reduction in
stimuli; df = 1 and 76 in both analyses).

DISCUSSION 


The results, considered all together,
suggest that adaptation-level theory
must be modified if it is to handle the
pronounced ordinal effects manipulated
in these three experiments. Specifically,
the theory does not account for either the
effects of the order of presentation found
in Exp. I and II or the reduction of the
effects of order of judgment found in
Exp. III. These findings suggest the
importance of further research in which
conditions affecting Ss’ response dispositions
are systematically manipulated.
The assumption of the crucial role played
by Ss’ covert judgments demands particular
study. It was assumed that
these judgments, perhaps basic perceptual
responses occurring even when
there have been no instructions for judgment,
tended to become attached to the
respective stimuli during the presentation
series. This suggests the possibility of
a general lag in the judgment process.
To what degree is there a resistance to
changing the judgmental response already
made to a given stimulus, and
what conditions affect such resistance?
Presumably, the responses should be
more strongly fixed under conditions
(like those of the present experiments)
which permit easy identification of the
respective stimuli than in situations
where the stimuli are not always present.
The results of Exp. III, which indicated
complete elimination of the ordinal effects
when the number of categories was increased,
are particularly difficult to incorporate
into a simple theory of adaptation
level. This experiment demonstrates
that AL cannot be predicted solely
from a knowledge of the stimulus values.
It thus appears that a general theory of
judgment must deal with the relationship
between the number of judgment categories
and the number of stimuli if it is
to successfully predict even the midpoint
of the judgment scale. The predictions
for Exp. III were made using the special
assumption that Ss tend to switch categories
whenever the new stimulus is discriminably
different from the old one.
It is hypothesized that this tendency
works along with S’s other categorization
habits, becoming manifest when the successive
stimuli are ordered, with the
number of stimuli being “large” relative
to the number of judgment categories.
While Exp. III provides only an
initial exploration of this relationship, the
observed effects upon judgment were
dramatic enough to warrant further
study.


SUMMARY 


Judgments of length were studied using 
the method of successive intervals, modified 
to permit independent manipulation of the 
order of stimulus presentation and the order
of judgment. For both positively- and negatively-skewed
distributions of stimuli, the
judgment scales shifted toward the values of 
the stimuli presented first and also toward 
those judged first. The effects of both skew- 
ness and order of judgment were interpreted 
as consistent with adaptation-level theory, 
but the presentation effects required some ad- 
ditional assumptions concerning the occur- 
rence of covert judgments. Experimental 
variations, performed to evaluate these addi- 
tional assumptions, indicated that the pres- 
entation effects occur even when there have 
been no prior instructions for judgment. It 
was also found that the judgment effects were 
reduced by either an increase in the number 
of categories or a decrease in the number of 
stimuli. These results suggested that adaptation-level
theory may have to be elaborated
in order to account for the following tendencies:
(a) to repeat the judgment previously
applied to the stimulus, and (b) to switch the
judgment whenever the stimulus is discriminably
different from the one which
preceded it.
BELVER C. GRIFFITH, HERMAN H. SPITZ, AND RONALD S. LIPMAN 1
Traditionally, Behaviorists have
maintained that implicit verbal re- 
sponses act as mediators in thinking 
and problem solving. When recent 
writers have taken this position (Os- 
good, 1953; Underwood, 1952), it has 
usually been implemented with two 
assumptions: first, that stimulus ma- 
terials elicit hierarchies of associa- 
tions; and second, that these associa- 
tions are the bases for forming equiv- 
alences and distinctions, and furnish, 
by eliciting more remote associations, 
a means of arriving at new relation- 
ships. From these notions one would 
predict, with Osgood (1953), that the 
difficulty of abstraction tasks in which 
S must identify a similarity among 
several objects should be a function 
of the availability of common media- 
tors. That is, S’s success in reporting 
a similarity should reflect both the 
presence and relative position of an 
identical characteristic in the hier- 
archies of associations elicited by the 
presented stimuli. 

One form of this hypothesis was con- 
firmed by Underwood and Richardson 
(1956) incidental to development of 
scaled materials for concept formation. 
They found that it was possible to 
devise easy and difficult stimulus 
words for the same abstraction on the 
basis of the frequency with which the 
abstraction is given in a controlled
association task to the stimulus words.
Both the procedure and results can be
made clear to the reader by examples
from that study. Baseball, fang,
paste, and sugar rarely elicited
“white,” while milk, chalk, snow, and
teeth frequently elicited this associate.
When these sets of words were presented
to other groups of Ss as stimuli
for drawing the abstraction “white,”
the first set was found to be relatively
more difficult.


1 The authors are indebted to Leonard S.
Blackman of the Edward R. Johnstone Training
and Research Center, Ruport Hester of the
Bureau of Social Research, Department of
Institutions and Agencies, State of New
Jersey, and David Zeaman of the University
of Connecticut for helpful criticisms of the
text and statistical procedure.


An exploratory study (Griffith &
Spitz, 1958) employing retarded Ss
confirmed a slightly different form of
the hypothesis, namely, that an individual’s
success in discovering a
similarity among three words is related
to the number of these words
that he usually defines in common with
a suitable abstraction. It was found,
as expected, that success in carrying
out the task increased as the number
of words defined in terms of an acceptable
abstraction increased.

In addition, the earlier findings con- 
tained some evidence that reporting a 
similarity depended directly upon 
several of the words eliciting a single 
mediator. Support for this view lay 
in the fact that nearly all of the im- 
provement accompanied the change 
from one to two words defined in 
common with an acceptable abstrac- 
tion. Two is, of course, the lowest 
value that offers Ss an opportunity 
to match stimulus words on the basis 
of a single characteristic immediately 
elicited by the presented words. 

The purpose of the present study
was to determine the relationship between
abstraction and word meaning
in normal Ss and to discover, if
possible, the role of mediators in their
performance. Accordingly, the earlier
experiment was repeated here on
two groups of normal children: one of
roughly the same average mental age
as the retarded group of the first
study, and a second, younger group.
In addition, the present study set out
to determine whether the degree of
retardation affects the function relating
abstraction to the number of words
defined in common with an acceptable
abstraction. For this purpose, additional
retarded Ss were tested.


METHOD 
Subjects 


Retarded group.—Additional data on re- 
tarded Ss were obtained from a group of 18 
girls. These Ss, randomly selected from a 
resident population, were nearly equivalent 
in age and IQ to the males who served as Ss in 
the previous study (Griffith & Spitz, 1958). 
When the combined male and female group 
of 44 Ss, Group R, was divided at the median 
on the basis of IQ, the high group (R-high) 
had an average IQ of 74 (range 66 to 84) and 
an average age of 17-9 while the low group 
(R-low) had an average IQ of 56 (range 45 to 
64) and an average age of 16-9.

Normal groups.—Two groups were taken 
from regular classes in public schools. The 
group of 9-yr. Ss (N-9) consisted of 31 students,
17 boys and 14 girls, from a third grade
class. The average age of this group was 9-3,
and its average IQ, based on group tests
routinely administered as part of the school 
program, was 109. The equivalent MA was 
roughly 10-1 yr. which is comparable to the 
MA of approximately 9-9 inferred for the total 
retarded group. 
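
The MA quoted for Group N-9 is consistent with the classical ratio definition of IQ, MA = IQ × CA / 100. The sketch below applies that definition to the group averages; the text does not state how the MA was derived, so treating it as a ratio-IQ conversion of the averages is an assumption, and the function name is illustrative.

```python
def mental_age_months(iq, ca_years, ca_months):
    """Ratio-IQ conversion: MA = IQ * CA / 100, with ages in months."""
    chronological_age = ca_years * 12 + ca_months
    return iq * chronological_age / 100


# Group N-9 averages reported in the text: age 9-3, IQ 109.
ma_months = mental_age_months(109, 9, 3)
ma_years, ma_remainder = divmod(round(ma_months), 12)
```

Under this assumption the result is 10 yr. 1 mo., in line with the MA of roughly 10-1 given above.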

The group of 7-yr. Ss (N-7) consisted of 
25 students from a first grade class. This 
group consisted of 14 girls and 11 boys, with 
an average age of 7-0. Although there were 
no intelligence scores for the Ss in Group 
N-7, it seems safe to assume that the mean 
intelligence of the students in this group was 
nearly average. There were two students in 
the class who could not serve as Ss because 
of their inability to understand the instruc- 
tions, and the teacher stated that they were 
“slow learners.”
Procedure


The procedure and scoring were exactly 
the same as in the previous experiment 
(Griffith & Spitz, 1958); therefore, only a 
brief description will be given here. The Ss 
attended two experimental sessions. In one 
session they received the abstraction test and 
in the other they received the definition test. 
The two sessions were approximately 24 hr.
apart; half the Ss were given the definition 
test and half the abstraction test at the first 
session. 

The abstraction test consisted of 6 triads 
of nouns, listed below, randomly embedded in
a longer test of 24 triads: 


1. coffee, tea, cocoa
2. bird, airplane, kite
3. carrots, peaches, meat
4. night, cave, closet
5. elephant, mountain, whale
6. pill, mosquito, pin


The S’s task was to report in what way the 
nouns in each triad were alike. 

In the definition test, the S was asked to 
define each of 24 words. Of these, 18 were 
taken from the six abstraction test triads, and 
6 were decoys. 

The primary object of the scoring proce- 
dure was to determine for these six triads the 
relationship between S’s definition of the 
nouns and his success or failure in drawing a 
correct abstraction from each group of three 
nouns. When S responded correctly to a 
triad of nouns in the abstraction test, his 
definitions of the three words in the triad 
were examined to determine how many included
the abstracted property. Thus, if S
responded: “They are all big,” to elephant,
mountain, and whale, the definitions of these
words would be examined to see whether they
included the word “big.” When S failed to
attain an abstraction, the number of words 
defined in common with a possible abstraction 
was tabulated. In this case, if several possi- 
ble abstractions were mentioned in the 
definitions, only the most frequently occurring 
one was considered. The property given in 
the definition was considered the same as the 
abstraction only if the same words or exact 
synonyms were directly applied to the noun.
Actually, the Ss almost never used synonyms 
in referring to the same property. 


RESULTS AND DISCUSSION 


The relationship between the percentage
of correct abstractions and the
number of stimulus words S defines in
common with an acceptable abstraction
has been plotted in Fig. 1 for
Groups N-9, N-7, and R. Groups
N-7 and R had essentially the same
function; the observed differences between
groups at every point were associated
with Mann-Whitney Us
having Ps greater than .50 of occurring
by chance. When either the
N-7 or R function is compared with
that of Group N-9, the differences, as
tested by two-tailed U tests, approach
statistical significance at Point 0
(P < .15) and attain statistical significance
(P < .01) at Point 1.

Each point in these functions is an
average in which single Ss are equally
represented. Thus, an S who defined
one word in common with an
abstraction in two of the six triads
would contribute a score of 100% to
the appropriate average if he gave a
correct abstraction to both triads, a
score of 50% if he gave a correct abstraction
to only one triad, and so on.
This cumbersome method of tabulation
was adopted to allow the number
of independent measures at each
point to be identified as the number
of Ss whose data were averaged to
determine that point. There still
remained the problem of different
combinations of triads being involved
in the comparison of groups. When
the data were pooled across groups for
each item and the relationship between
abstraction and definition
plotted for items, however, the triads
proved to be more or less equivalent.
Statistical comparisons of points on a
single function cannot be made since
every S in a group did not always
contribute to all points.2 The difficulty
is, of course, that these comparisons
would be based on both
correlated and independent measures.
This restriction also rules out the
usual techniques for making over-all
comparisons of functions between
groups.


Fig. 1. The percentage of correct abstractions
shown as a function of the number
of stimulus words defined in terms of an acceptable
abstraction for the three main groups.
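
The tabulation described above, in which each S contributes a single percentage to a point, can be sketched as follows. The record layout, the function name, and the example records are hypothetical; they only illustrate how weighting Ss equally differs from pooling triads.

```python
from collections import defaultdict
from statistics import mean


def point_functions(records):
    """records: (subject, point, correct) tuples, one per triad, where
    point is the number of words (0-3) S defined in common with an
    acceptable abstraction and correct is True or False.
    Returns {point: (mean percentage across Ss, number of Ss)}."""
    per_subject = defaultdict(lambda: defaultdict(list))
    for subject, point, correct in records:
        per_subject[subject][point].append(100.0 if correct else 0.0)

    by_point = defaultdict(list)
    for points in per_subject.values():
        for point, scores in points.items():
            # Each S contributes a single percentage to a given point.
            by_point[point].append(mean(scores))

    return {p: (mean(v), len(v)) for p, v in sorted(by_point.items())}


# Made-up records for two Ss; not data from the study.
records = [
    ("A", 1, True), ("A", 1, False), ("A", 3, True),
    ("B", 1, True), ("B", 0, False),
]
functions = point_functions(records)
```

Here S “A” contributes 50% at Point 1 (one of two triads correct) and S “B” contributes 100%, so the Point 1 average is 75% on two independent measures, whereas pooling the three triads would give about 67%.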

An inspection of the plotted data
for Groups N-7 and R reveals the
inflection that had been previously
taken to indicate that mediation plays
an important role in the S’s discovery
of a similarity (Griffith & Spitz, 1958).
Apparently, the opportunity to match
stimulus words on the basis of the
immediate common associates elicited
by the words is also important to the
performance of the younger group of
normal Ss. A sharp rise between
Points 1 and 2 does not occur, however,
in the function for normal nine-year-olds.
The greatest improvement
in their performance accompanies
the change from zero to one
word defined in common and serves
as an indication that these Ss make
excellent use of the information they
have when they define only one word
in common with an acceptable abstraction.
This finding suggests a
more mature use of mediators,
namely, testing all associates against
the stimulus words.


2 Of the 31 Ss in Group N-9, 21 contributed
scores to Point 0, 22 to Point 1, 30 to Point
2, and 30 to Point 3. The same counts are
24, 17, 13, and 22 of the 25 Ss in Group N-7;
and 40, 34, 26, and 41 of the 44 Ss in Group
R. For Fig. 2, these counts are 22, 15, 8, and
19 for the 22 R-low Ss; and 18, 19, 18, and
22 for the 22 R-high Ss.


Fig. 2. The percentage of correct abstractions
shown as a function of the number
of stimulus words defined in terms of an acceptable
abstraction for the high and low
halves of the retarded group.

Figure 2 displays the functions for 
the high and low halves of the re- 
tarded groups (R-high and R-low). 
The curves are similar in form in that 
most of the improvement accompanies 
the change from one to two words 
defined in terms of an acceptable ab- 
straction. As one might expect, the 
R-high function is higher in level, 
differences at Points 0 and 1 being as- 
sociated with Mann-Whitney Us 
which would rarely occur by chance 
(P < .05 for two-tailed tests). The 
design of the present experiment does
not lend itself to a ready explanation
for the higher level of performance of
R-high Ss at Points 0 and 1. Further
experiments are planned to clarify
this issue.


From the point of view of understand- 
ing mental deficiency, it is interesting to 
see that the R-low function approximates 
the curve which would be expected if the 
matching of common associates present 
in the definitions of several words were 
the only means of attaining an abstrac- 
tion, revealing an extreme reliance on 
this use of mediators. In this case, there 
would be zero correct at Points 0 and 1, 
where there is no opportunity to make a 
match and, of course, 100% correct at 
Points 2 and 3. It is easy to see (Fig. 2)
that the R-low curve is nearly the same 
as that expected from theory. 

The usual assumption of clinicians that 
conceptual tasks are sensitive indicators 
of retardation is supported in these data 
by the presence of an inflection between 
one and two in the R-high function. 
Even these relatively mature and intelli- 
gent retardates appear to require an 
opportunity to match words on the basis 
of their immediately eliciting the same 
associate. 


SUMMARY 


The present study attempted to cast light 
on the relationship between concept formation 
and the availability of mediators. A group of 
retarded Ss, an equal MA group of 9-yr. 
normals, and a group of 7-yr. normals, were 
given an abstraction task in which they had 
to discover a similarity among three words. 
In a separate session, the stimulus words were 
presented to determine the number, in each 
triad, defined in terms of an acceptable ab- 
straction. 

Although the percentage of abstractions 
attained generally increased as the number of 
words defined in common with a possible 
abstraction increased, the nature of the rela- 
tionship varied among groups. Retardates 
and normal 7-yr. Ss were not very successful 
in concept attainment unless they had the 
opportunity to match words on the basis of 
their eliciting a common immediate associate;
i.e., unless they defined at least two words in
terms of an acceptable abstraction. Normal
9-yr. Ss, on the other hand, were relatively
successful even when they defined only one
word in terms of an abstraction.

The question of the retention of
remote associations was given theoret- 
ical prominence by McGeoch in con- 
nection with his differential forgetting 
hypothesis (McGeoch & Irion, 1952, 
p. 183 ff.). The hypothesis states 
that incorrect (including remote) as- 
sociations formed during learning, 
being weaker, will be forgotten at a 
faster rate than reinforced adjacent 
forward associations. While acceptance
of this hypothesis has been
favored by its ability to explain many
of the facts of reminiscence and dis- 
tribution of practice in a simple and 
plausible manner, several predictions 
made on the basis of differential for- 
getting have failed (Buxton, 1949; 
Riley, 1953; Underwood & Goad, 
1951; Wilson, 1943; Wilson, 1949). 
Only the work of Wilson (1943, 1949), 
however, represents a direct approach 
to the key problem of whether remote 
associations are forgotten faster than 
adjacent forward associations. Using
the Association Method, he found
equal rates of forgetting of adjacent
forward and remote associations at
intervals up to 20 min. following antic-
ipation learning of 16-adjective lists.

Two factors other than a failure of
differential forgetting may have con-
tributed to Wilson’s negative results.
The first of these factors, recognized
by Wilson (1949), is that since reminis-
cence is unobtainable or markedly
reduced when adjective lists are used
(Buxton, 1949; Noble, 1950), ad-
jective lists probably do not allow a
fair test of differential forgetting.
The second factor is a possible change
in contextual stimuli in shifting from
learning conditions to the Association
Test. Such a change might result in
the elimination of many weak remote
associations, rendering the method
insensitive to amounts of differential
forgetting sufficient to affect recall or
relearning. The experiment to be
reported attempts to avoid the first
objection by using nonsense syllables,
and to avoid the second by extending
the maximum interval between learn-
ing and the Association Test to 48 hr.,
at which time it was hoped the effect
would be detectable. That differ-
ential forgetting or a similar process
may continue over relatively long
periods of time is suggested by the
rise in 48-hr. retention scores ob-
tained by Postman and Rau (1957).

The author is grateful to Leo Postman
for his generous advice and encouragement


METHOD 


Materials.—The learning materials were
the three low intralist similarity lists of non-
sense syllables used by Underwood (1952).

Procedure.—Each S read standard
anticipation-learning instructions, and learned
the list of nonsense syllables to a criterion of
11/14 correct anticipations (79%). The
syllables were presented on a memory drum
at a 2-sec. rate with 4 sec. between trials.
An asterisk preceded the first syllable, and S
was instructed to spell the syllables. The
Association Test was administered to different
groups 30 sec., 20 min., or 48 hr. after the
end of practice. The S was seated at
another memory drum and told to respond to
each syllable he was shown with the first
other syllable from the list that came to mind.
These instructions were a modified version of
those used by Wilson (1949). The drum was
set at a 4-sec. rate; each syllable remained
visible through one turn of the drum, and was
replaced on the second turn. This gave S 8
sec. in which to respond; most Ss responded
within the first 4 sec. Although the second 4
sec. may have served as an opportunity for
rehearsal, this should not have affected the
three groups differentially.

TABLE 1
MEANS AND SDs OF LEARNING MEASURES

                               Retention Interval
Measure                        20 Min.
                               Mean
Trials to criterion            9.31   25.67
Correct anticipations/trial    .78    5.61
Intralist errors/trial         .99
Other errors/trial             A2 51

For the Association Test lists, three random
orders were made up, using a table of random
numbers. The three lists and three orders were
used in each of the retention intervals in a
3 × 3 × 3 factorial design, replicated twice.
The Ss in the 20-min. retention interval were
given a lengthy magazine article to read in the
guise of a “test of reading speed and compre-
hension.” The Ss in the 48-hr. group were in-
structed to return in 48 hr. for such a test.

Subjects.—The Ss were 54 students from
undergraduate psychology classes at the
University of California. The majority of
them had had no prior experience in rote
serial learning. They signed up for two hours
of experimentation as part of a class require-
ment, and were assigned to retention inter-
vals, lists, and association orders in the order
of their appearance in the laboratory. It was
necessary to replace 8 Ss: 4 for not follow-
ing instructions, and 4 who had not reached
the 11/14 criterion within 40 min. of their
arrival. Of these slow learners, two were in
the 30-sec. group, two in the 20-min. group.
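
The counterbalancing described above can be sketched in Python. The sketch enumerates the 3 × 3 × 3 factorial design (retention interval by list by association order), replicates it twice, and confirms that this accounts for the 54 Ss. The labels `List A` and `Order 1` and the function name `build_schedule` are hypothetical, and the sequence in which cells were filled is an assumption, since the text says only that Ss were assigned in the order of their appearance.

```python
from itertools import product

INTERVALS = ["30 sec.", "20 min.", "48 hr."]
LISTS = ["List A", "List B", "List C"]       # hypothetical labels for the three lists
ORDERS = ["Order 1", "Order 2", "Order 3"]   # hypothetical labels for the three test orders
REPLICATIONS = 2


def build_schedule():
    """Return one (interval, list, order) cell per S, in order of appearance.

    Assumption: cells are filled in a fixed nested sequence; the report does
    not give the actual sequence.
    """
    cells = list(product(INTERVALS, LISTS, ORDERS))  # 27 distinct cells
    return cells * REPLICATIONS                      # every cell is used twice


schedule = build_schedule()
print(len(schedule))  # 54
```

Each retention interval therefore receives 18 Ss, 6 per list and 6 per association order.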


Scoring of remote associations.—Scoring
was similar to Wilson’s (1949). Each cor-
rectly spelled syllable given on the Association
Test was classified as an adjacent forward
association if, during learning, it had followed
the stimulus syllable. All other correctly
spelled syllables were assigned a degree of
remoteness corresponding to the number of
steps between the syllable in question and the
stimulus syllable. The first and last syllables
were considered one step removed from one
another; otherwise, the list was treated as a
linear series.
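
The scoring rule can be illustrated with a short Python sketch. It rests on several assumptions that the text does not settle: degree of remoteness is taken as the signed difference in serial position (so +1 is the adjacent forward association), the last-to-first wrap-around is scored as +1 and first-to-last as -1, and “far remote” responses (beyond +3 or -2, as defined for Table 2) are reported as a category separate from the other remote responses. The function names are hypothetical.

```python
LIST_LENGTH = 14  # the Summary describes a 14-syllable list


def signed_steps(stimulus_pos, response_pos, n=LIST_LENGTH):
    """Signed number of steps from stimulus to response (1-based positions).

    Positive is forward, negative is backward. The first and last syllables
    count as one step apart; the sign of that wrap-around case is an
    assumption of this sketch.
    """
    if stimulus_pos == response_pos:
        raise ValueError("response must be another syllable from the list")
    if stimulus_pos == n and response_pos == 1:
        return 1
    if stimulus_pos == 1 and response_pos == n:
        return -1
    return response_pos - stimulus_pos


def classify(steps):
    """Map a signed step count onto the response classes of the Association Test."""
    if steps == 1:
        return "adjacent forward"
    if steps > 3 or steps < -2:
        return "far remote"
    return "forward remote" if steps > 0 else "backward remote"


print(classify(signed_steps(5, 6)))   # adjacent forward
print(classify(signed_steps(5, 7)))   # forward remote
print(classify(signed_steps(5, 4)))   # backward remote
print(classify(signed_steps(5, 10)))  # far remote
```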


RESULTS 


Learning.—Means and SDs of trials
to criterion, correct anticipations per
trial, intralist errors per trial, and
other errors per trial are presented in
Table 1. An analysis of variance of
trials to criterion revealed no differ-
ence among groups, the only sig-
nificant F ratios being those for Lists
(F = 4.497, P < .05, 2/27 df), and
the interaction of Lists, Association
Orders, and Groups (F = 2.342, P <
.05, 8/27 df). The significant differ-
ence between lists indicates that the
three Underwood lists are not of equal
difficulty; no meaningful generaliza-
tion can be suggested with respect to
the significant interaction. Similar
analyses of the other measures of
Table 1 revealed no differences sig-
nificant at the .05 level. There is a
tendency, however, for the 20-min.
group to be highest in correct anticipa-
tions and intralist errors, while the
48-hr. group is lowest in intralist
errors and other errors. The differ-
ence in intralist errors is greatest
toward the center of the list. The
over-all tendency is for the 20-min.
group to give most responses per
trial, while the 48-hr. group gives
fewest responses per trial.

Remote associations.—Table 2 sum-
marizes the results of the Association
Test. Backward remote associations
occur in slightly greater numbers than
forward remote associations. Fail-
ures to respond and errors other than
intralist errors were so infrequent that
only means are given for the three
retention intervals. When the fre-
quency of associations was plotted
against degree of remoteness, the usual
gradient was found, with more associa-
tions of low degrees of remoteness being
given than associations of high degrees of
remoteness. Even after correction
for opportunity, there was evidence
of a remote association gradient ex-
tending forward and backward over at
least two degrees of remoteness. The
effect of an additional correction for
serial position effects (to eliminate
possible biases resulting from the fact
that few errors are made in response
to the end items) was negligible.

TABLE 2
RESPONSES GIVEN ON THE ASSOCIATION TEST

                        Retention Interval
Classification          30 Sec.
                        Mean
Adjacent forward
Forward remote
Backward remote
“Far remote”*
Failures to respond
Other

* “Far remote” associations are all remote associations whose degree of remoteness is greater than +3 or -2.

Differential forgetting.—In view of
the tendency mentioned above for Ss 
in different groups to give different 
numbers of responses per trial in 
learning, mean responses per trial 
were used in covariance analyses of 
remote associations. Two analyses 
were performed. The first, on the 
proportion of adjacent forward as- 
sociations, was analogous to Wilson’s 
(1949) test of differential forgetting. 


For the second analysis, proportions
of “far remote” associations were ob-
tained. “Far remote” associations
were those which exceeded Degree 3 
in the forward direction and Degree 2 
in the backward direction. While 
such a division is arbitrary in some 
ways, there seemed to be two reasons 
for making it. First, the gradient of 
frequency vs. degree of remoteness 
was most pronounced for low degrees 
of remoteness; and second, increased 
sensitivity to differential forgetting 
might be expected with such a meas- 
ure, in view of Patten’s (1938) ob- 
servation that under distributed prac- 
tice only errors of four degrees of 
remoteness or more show a decrease 
in comparison with a massed practice 
condition. It should be noted that 
this was not confirmed by Underwood 
and Goad (1951). All proportions
were submitted to an arcsine trans-
formation. Using a pooled error
term, there was no differential forget-
ting significant at the .05 level in
either analysis, although in the analy-
sis of “far remote” associations, the
obtained F ratio was barely short of
significance (F = 3.119, 2/50 df, with
an F of 3.180 required for significance
at the .05 level).
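
As an illustration of the transformation step, the following Python sketch converts a proportion to its arcsine value and repeats the comparison of the obtained F with the required F reported above. The form arcsin(sqrt(p)) in radians is an assumption, since the text does not say which version of the arcsine transformation was applied; the example counts and the function names are hypothetical.

```python
import math


def arcsine_transform(p):
    """Variance-stabilizing transform of a proportion, in radians.

    Assumption: the arcsin(sqrt(p)) form; the report does not specify it.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


def proportion(count, total):
    """Share of one response class among an S's Association Test responses."""
    return count / total


# Hypothetical S: 3 "far remote" responses among 14 responses.
p_far = proportion(3, 14)
print(round(arcsine_transform(p_far), 3))

# Values reported in the text for the "far remote" analysis (2/50 df).
F_OBTAINED, F_REQUIRED = 3.119, 3.180
print(F_OBTAINED >= F_REQUIRED)  # False
```

The transform stretches proportions near 0 and 1, which makes their variances more nearly equal across groups before the analysis of covariance.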

DISCUSSION


The results confirm those of Wilson (1943,
1949). While the reduction in frequency
of the more remote associations, when 
compared with the total number of asso- 
ciations, is extremely close to significance, 
there is still little support, if any, for 
differential forgetting. 

A disadvantage of both Wilson's and 
this experiment is that there has been 
very little forgetting of any sort. It may 
well be that only when adjacent forward 
associations have been forgotten can we 
predict with any assurance that remote 
associations will have been forgotten 
faster or to a greater degree. However, 
it may be that the Association Method 
will not reveal differential forgetting 
even when adjacent forward associations 
are forgotten; McGeoch (1939), in an 
experiment reported in abstract form 
only, found more forgetting of adjacent 
forward than of remote associations after 
interpolation—but it is probable that 
interpolation is not a satisfactory parallel 
to the conditions under which differential 
forgetting is assumed to occur. 


SUMMARY 


In an attempt to test the differential for- 
getting hypothesis, remote associations were 
obtained by the Association Method 30 sec., 
20 min., or 48 hr. after a 14-syllable list of low 
intralist similarity had been learned to a cri- 
terion of 11/14 correct anticipations. In 
terms of the proportion of adjacent forward 
associations, there was no differential forget- 
ting; however, there was a decrease in the 
frequency of the more remote associations 
which barely missed significance at the .05 
level. 


REFERENCES 


Buxton, C. E. Repetition of two basic ex-
periments on reminiscence in serial verbal
learning. J. exp. Psychol., 1949, 39, 676-
682.

McGeoch, J. A. Remote associations as a
function of interpolated learning. Psychol.
Bull., 1939, 36, 545.

McGeoch, J. A., & Irion, A. L. The psy-
chology of human learning. New York:
Longmans, Green, 1952.

Noble, C. E. Absence of reminiscence in the
serial rote learning of adjectives. J. exp.
Psychol., 1950, 40, 622-631.

Patten, E. F. The influence of distribution
of repetitions on certain rote learning phe-
nomena. J. Psychol., 1938, 5, 359-374.

Postman, L., & Rau, L. Retention as a
function of the method of measurement.
Univer. Calif. Publ. Psychol., 1957, 8,
217-270.

Riley, D. A. Reminiscence effects in paired-
associate learning. J. exp. Psychol., 1953,
45, 232-238.

Underwood, B. J. Studies of distributed
practice: VII. Learning and retention of
serial nonsense lists as a function of intralist
similarity. J. exp. Psychol., 1952, 44,
80-87.

Underwood, B. J., & Goad, D. Studies of
distributed practice: I. The influence of
intralist similarity in serial learning. J.
exp. Psychol., 1951, 42, 125-134.

Wilson, J. T. Remote associations as a
function of the length of interval between
learning and recall. J. exp. Psychol., 1943,
33, 40-48.

Wilson, J. T. The formation and retention
of remote associations in rote learning.
J. exp. Psychol., 1949, 39, 830-838.


(Received October 16, 1958) 











